In theory, quantum computers can efficiently simulate quantum physics, factor large numbers and
estimate integrals, thus solving otherwise intractable computational problems. In practice, quantum
computers must operate with noisy devices called "gates" that tend to destroy the fragile quantum
states needed for computation. The goal of fault-tolerant quantum computing is to compute
accurately even when gates have a high probability of error each time they are used. Here we give
evidence that accurate quantum computing is possible with error probabilities above 3% per gate,
which is significantly higher than what was previously thought possible. However, the resources
required for computing at such high error probabilities are excessive. Fortunately, they decrease
rapidly with decreasing error probabilities. If we had quantum resources comparable to the
considerable resources available in today's digital computers, we could implement non-trivial
quantum computations at error probabilities as high as 1% per gate.

Research in quantum computing is motivated by the great increase in computational power offered by
quantum computers [1-3]. There is a large and still growing number of experimental efforts whose
ultimate goal is to demonstrate scalable quantum computing. Scalable quantum computing requires
that arbitrarily large computations can be efficiently implemented with little error in the
output. Criteria that need to be satisfied by devices used for scalable quantum computing have been
specified by DiVincenzo [4]. One of the criteria is that the level of noise affecting the physical
gates is sufficiently low. The type of noise affecting the gates in a given implementation is
called the "error model". A scheme for scalable quantum computing in the presence of noise is
called a "fault-tolerant architecture". In view of the low-noise criterion, studies of scalable
quantum computing involve constructing fault-tolerant architectures and providing answers to
questions such as the following: Q1: Is scalable quantum computing possible for error model $E$?
Q2: Can fault-tolerant architecture $A$ be used for scalable quantum computing with error model
$E$? Q3: What resources are required to implement quantum computation $C$ using fault-tolerant
architecture $A$ with error model $E$?

To obtain broadly applicable results, fault-tolerant architectures are constructed for generic
error models. Here, the error model is parametrized by an error probability per gate (or simply
error per gate, EPG), where the errors are unbiased and independent. The fundamental theorem of
scalable quantum computing is the threshold theorem and answers question Q1 as follows: If the EPG
is smaller than a threshold, then scalable quantum computing is possible [5-8]. Thresholds depend
on additional assumptions on the error model and device capabilities. Estimated thresholds vary
from below $10^{-6}$ [5-8] to $3 \times 10^{-3}$ [9], with $10^{-4}$ [10] often quoted as the EPG
to be achieved in experimental quantum computing.

Many experimental proposals for quantum computing claim to achieve EPGs below $10^{-4}$ in theory.
However, in the few cases where experiments with two quantum bits (qubits) have been performed, the
EPGs currently achieved are much higher, $3 \times 10^{-2}$ or more in ion traps [11,12] and
liquid-state NMR [13,14]. The first goal of our work is to give evidence that scalable quantum
computing is possible at EPGs above $3 \times 10^{-2}$. While this is encouraging, the
fault-tolerant architecture that achieves this is extremely impractical because of large resource
requirements. To reduce the resource requirements, lower EPGs are required. The second goal of our
work is to give a fault-tolerant architecture (called the "$C_4/C_6$ architecture") well suited to
EPGs between $10^{-4}$ and $10^{-2}$ and to determine its resource requirements, which we compare
to the state of the art in scalable quantum computing as exemplified by the work of Steane [9].

Fault-tolerant architectures realize low-error qubits and gates by encoding them with
error-correcting codes. A standard technique for amplifying error reduction is concatenation.
Suppose we have a scheme that, starting with qubits and gates at one EPG, produces encoded qubits
and gates that have a lower EPG. Provided the error model for encoded gates is sufficiently well
behaved, we can then apply the same scheme to the encoded qubits and gates to obtain a next level
of encoded qubits and gates with much lower EPGs. Thus, a concatenated fault-tolerant architecture
involves a hierarchy of repeatedly encoded qubits and gates. The hierarchy is described in terms of
levels of encoding, with the physical qubits and gates being at level 0. The top level is used for
implementing quantum computations, and its qubits, gates, EPGs, etc. are referred to as being
"logical". Typically, the EPGs decrease superexponentially with the number of levels, provided that
the physical EPG is below the threshold for the architecture in question.
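
The superexponential decrease can be illustrated with a generic toy model in which each level of
encoding maps an EPG $p$ to $p^2/p_{th}$. This quadratic map, the function name `concatenated_epg`
and the threshold value $p_{th} = 0.01$ are illustrative assumptions only; they are not the error
behavior of any architecture discussed here.

```python
def concatenated_epg(p0: float, p_th: float, levels: int) -> list[float]:
    """Return [p_0, ..., p_levels] under the toy map p -> min(1, p**2 / p_th).

    The quadratic map is an illustrative assumption, not a model of any
    particular fault-tolerant architecture.
    """
    epgs = [p0]
    for _ in range(levels):
        # Cap at 1 so the values remain probabilities above threshold.
        epgs.append(min(1.0, epgs[-1] ** 2 / p_th))
    return epgs


if __name__ == "__main__":
    # One starting EPG below and one above the assumed threshold of 0.01.
    for p0 in (0.005, 0.02):
        print(p0, ["%.2e" % p for p in concatenated_epg(p0, p_th=0.01, levels=4)])
```

In this toy model the level-$l$ EPG is $p_{th}(p_0/p_{th})^{2^l}$, so it shrinks doubly
exponentially in $l$ when $p_0 < p_{th}$ and grows otherwise.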

The $C_4/C_6$ architecture differs from previous ones in five significant ways. First, we use the
simplest possible error-detecting codes, thus avoiding the complexity of even the smallest
error-correcting codes. Error correction is added naturally by concatenation. Second, error
correction is performed in one step and combined with logical gates by means of error-correcting
teleportation. This minimizes the number of gates contributing to errors before they are corrected.
Third, the fault-tolerant architecture is based on a minimal set of operations with only one
unitary gate, the controlled-NOT. Although this set does not suffice for universal quantum
computing, it is possible to bootstrap other gates. Fourth, verification of the needed ancillary
states (logical Bell states) largely avoids the traditional syndrome-based schemes. Instead, we use
hierarchical teleportations. Fifth, the highest thresholds are obtained by introducing the model of
postselected computing with its own thresholds, which may be higher than those for standard quantum
computing. Our fault-tolerant implementation of postselected computing has the property that it can
be used to prepare states sufficient for (standard) scalable quantum computing.

Basics. For an introduction to quantum information, computing and error correction, see [15]. The
unit of quantum information is the qubit, whose states are superpositions
$\alpha|0\rangle + \beta|1\rangle$. Qubits are acted on by the Pauli operators $X = \sigma_x$ (bit
flip), $Z = \sigma_z$ (sign flip) and $Y = \sigma_y = i\sigma_x\sigma_z$. The identity operator is
$I$. One-qubit gates include preparation of $|0\rangle$ and
$|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, $Z$-measurement (distinguishing between $|0\rangle$
and $|1\rangle$), $X$-measurement (distinguishing between $|+\rangle$ and
$|-\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$), and the Hadamard gate (HAD,
$\alpha|0\rangle + \beta|1\rangle \mapsto \alpha|+\rangle + \beta|-\rangle$). We use one unitary
two-qubit gate, the controlled-NOT (CNOT), which maps $|00\rangle \mapsto |00\rangle$,
$|01\rangle \mapsto |01\rangle$, $|10\rangle \mapsto |11\rangle$ and $|11\rangle \mapsto |10\rangle$.
This set of gates is a subset of the so-called Clifford gates, which are insufficient for universal
quantum computing [10]. Our minimal gate set $\mathcal{G}_{\min}$ consists of $|0\rangle$ and
$|+\rangle$ preparation, $Z$ and $X$ measurement and CNOT. Universality may be achieved with the
addition of other one-qubit preparations or measurements, as explained below. The physical gates
mentioned are treated as being implemented in one "step"; the actual implementation may be more
complex.

The $C_4/C_6$ architecture is based on two error-detecting stabilizer codes, $C_4$, a four-qubit
code, and $C_6$, a three qubit-pair code, both encoding a qubit pair. A stabilizer code is a common
eigenspace of a set of commuting products of Pauli operators (the "check operators"). Such products
are denoted by strings of $X$, $Y$, $Z$ and $I$. For example, $XIZ$ is a Pauli product for three
qubits with $X$ acting on the first and $Z$ on the last. The shortest error-detecting code $C_4$
for qubits encodes two qubits in four and has check operators $XXXX$ and $ZZZZ$. The encoded qubits
(labeled $L$ and $S$) are defined by encoded operators $X_L = XXII$, $Z_L = ZIZI$, $X_S = IXIX$ and
$Z_S = IIZZ$. We use this code as a first level of encoding and call the encoded qubits "level 1"
qubits. Level 1 qubits come in pairs, each encoded in a "block" of four physical qubits. The second
code $C_6$ is constructed as a code on three qubit pairs able to detect any error acting on one
pair. It encodes a qubit pair and has check operators $XIIXXX$, $XXXIIX$, $ZIIZZZ$ and $ZZZIIZ$
acting on three consecutive qubit pairs. A choice for encoded operators for $C_6$ is
$X_L = IXXIII$, $Z_L = IIZZIZ$, $X_S = XIXXII$, $Z_S = IIIZZI$. This code is used for the second
and higher levels of encoding. For example, the second level is obtained by using three level 1
pairs to obtain a level 2 pair. A level $l$ qubit pair requires a block of $4 \times 3^{l-1}$
physical qubits. The block structure is depicted in Fig. 1.
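
The algebraic properties claimed for the two codes can be checked mechanically. The following
sketch tests that the check operators commute, that the encoded operators commute with the checks
and pair up as two encoded qubits, and that every error confined to one physical qubit (for the
four-qubit code) or one qubit pair (for the six-qubit code) anticommutes with some check. The
function names are illustrative, and the block-size formula assumes four qubits at level 1 and a
tripling at each further level, as implied by the construction.

```python
from itertools import combinations, product


def commute(p: str, q: str) -> bool:
    """Pauli strings commute iff they clash (differ, both non-identity) on an even number of sites."""
    clashes = sum(1 for a, b in zip(p, q) if "I" not in (a, b) and a != b)
    return clashes % 2 == 0


C4 = {"checks": ["XXXX", "ZZZZ"],
      "XL": "XXII", "ZL": "ZIZI", "XS": "IXIX", "ZS": "IIZZ"}
C6 = {"checks": ["XIIXXX", "XXXIIX", "ZIIZZZ", "ZZZIIZ"],
      "XL": "IXXIII", "ZL": "IIZZIZ", "XS": "XIXXII", "ZS": "IIIZZI"}


def is_valid_code(code: dict) -> bool:
    """Check commutation of checks and the algebra of the encoded operators."""
    checks = code["checks"]
    logicals = [code[k] for k in ("XL", "ZL", "XS", "ZS")]
    if not all(commute(a, b) for a, b in combinations(checks, 2)):
        return False
    if not all(commute(c, op) for c in checks for op in logicals):
        return False
    # X and Z of the same encoded qubit anticommute; all other pairs commute.
    return (not commute(code["XL"], code["ZL"]) and not commute(code["XS"], code["ZS"])
            and commute(code["XL"], code["ZS"]) and commute(code["XS"], code["ZL"])
            and commute(code["XL"], code["XS"]) and commute(code["ZL"], code["ZS"]))


def detects_all_errors_on(checks: list[str], support: tuple[int, ...]) -> bool:
    """True if every non-identity Pauli supported on `support` anticommutes with some check."""
    n = len(checks[0])
    for paulis in product("IXYZ", repeat=len(support)):
        if set(paulis) == {"I"}:
            continue
        err = ["I"] * n
        for pos, p in zip(support, paulis):
            err[pos] = p
        if all(commute(c, "".join(err)) for c in checks):
            return False
    return True


def block_size(level: int) -> int:
    """Physical qubits per encoded pair: 4 at level 1, tripled at each further level."""
    return 4 * 3 ** (level - 1)
```

For instance, `detects_all_errors_on(C4["checks"], (0, 1))` is false because $XXII$ commutes with
both checks; it is an encoded operator rather than a detectable error.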

The concatenation of error-detecting codes allows for a flexible use of error detection and
correction. Given a joint eigenstate of the check operators, its list of eigenvalues is called the
"syndrome". The level $l$ encoding has check operators that can be derived from the check and
encoded operators of $C_4$ and $C_6$. Ideally, the state of a level $l$ qubit exists in the
subspace with trivial syndrome (all eigenvalues are +1). In the presence of errors this is
typically not the case, so the state is defined only with respect to a current "Pauli frame" and an
implicit recovery scheme. The Pauli frame is defined by a Pauli product that restores the
error-free state of the block to the trivial-syndrome subspace. The implicit recovery scheme
determines the Pauli products needed to coherently map states with other syndromes to the syndrome
of the error-free state. Defining the level $l$ state in this way makes it possible to avoid
explicitly applying Pauli products for correction or teleportation compensation [9]. Error
detection and correction are based on measurements that retroactively determine the syndrome of the
state (the current syndrome has already been affected by further errors). An error is detected when
the syndrome differs from that expected according to the Pauli frame. In "postselected" quantum
computing, the state is then rejected and the computation restarted. In standard quantum computing,
the syndrome information must be used to update the Pauli frame. With the $C_4/C_6$ architecture,
it is possible to do so at level 2 and above by the following method: First check the level 1 $C_4$
syndromes of each block of four qubits. For each block where an error is detected, mark the encoded
level 1 qubit pair as having an error. Proceed to level 2 and check the (encoded) $C_6$ syndrome
for each block of three level 1 pairs. If exactly one of the level 1 pairs has an error, use the
$C_6$ syndrome to correct it. This works because error-detecting codes can always correct an error
at a known location. If not, mark the encoded level 2 pair as having an error unless none of the
three level 1 pairs has an error and the $C_6$ syndrome is as expected according to the Pauli
frame. Continue in this fashion through all higher levels. For optimizing state preparation, we can
replace the error-correction step by error detection at the top few levels, depending on context as
explained below.
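
The marking rule just described is purely classical bookkeeping and can be sketched as follows.
Only the flag logic is modeled: the Boolean inputs stand for "syndrome as expected according to the
Pauli frame", and the actual Pauli-frame update is omitted. The function names and the return
convention are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def level1_flag(c4_syndrome_as_expected: bool) -> bool:
    """A level 1 pair is marked as having an error iff its four-qubit syndrome is unexpected."""
    return not c4_syndrome_as_expected


def combine_block(sub_flags: Sequence[bool],
                  syndrome_as_expected: bool) -> Tuple[bool, Optional[int]]:
    """Combine three lower-level pairs into one pair at the next level.

    sub_flags[i] is True if lower-level pair i is marked as having an error.
    Returns (error_flag, corrected_index), where corrected_index is the
    position of the single marked pair that the six-qubit-code syndrome is
    used to correct, or None if no correction takes place.
    """
    flagged = [i for i, flag in enumerate(sub_flags) if flag]
    if len(flagged) == 1:
        # Error at a known location: the detecting code can correct it.
        return False, flagged[0]
    if not flagged and syndrome_as_expected:
        return False, None
    # Two or more marked pairs, or an unexpected syndrome with no marked pair.
    return True, None


if __name__ == "__main__":
    # Level 2 block whose second level 1 pair showed an unexpected syndrome.
    flags = [level1_flag(ok) for ok in (True, False, True)]
    print(combine_block(flags, syndrome_as_expected=False))
```

The same function can be applied again to three level 2 flags to obtain a level 3 flag, and so on
through the hierarchy.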

Error model and assumptions. All error models can be described by inserting errors (which act as
quantum operations) after gates or before measurements. We could model correlations between the
errors by extending the errors' quantum operations to a common external environment. However, here
we assume that errors are independent. We further assume that a gate's errors consist of
applications of Pauli products with probabilities determined by the gate. Ideally, we would obtain
a threshold that does not depend on the details of the probability distributions of Pauli products.
This is too difficult with available techniques, so a depolarizing model is assumed for each gate:
$|0\rangle$ ($|+\rangle$) state preparation erroneously produces $|1\rangle$ ($|-\rangle$) with
probability $e_p$. A binary (e.g. $Z$ or $X$) measurement results in the wrong outcome with
probability $e_m$. CNOT is followed by one of the 15 possible non-identity Pauli products, each
with probability $e_c/15$. HAD is modified by one of the Pauli operators, each with probability
$e_h/3$. We further simplify by setting $e_c = \gamma$, $e_m = e_p = 4\gamma/15$ and
$e_h = 4\gamma/5$. This choice is justified as follows: $4\gamma/5$ is the one-qubit marginal
probability of error for the CNOT, and it is reasonable to expect that one-qubit gates have error
below this. In fact, one-qubit gates have much lower error than CNOTs in experimental systems such
as ion traps and liquid-state NMR. As for preparation errors, if they are much larger than
$4\gamma/15$, then it is possible to purify prepared states using a CNOT. For example, prepare
$|0\rangle$ twice, apply a CNOT from the first to the second and $Z$-measure the second. Try again
if the measurement outcome indicates $|1\rangle$, otherwise use the first state. The probability of
error is given by $4\gamma/15 + O(\gamma^2)$, assuming that CNOT error is as above and measurement
and preparation errors are proportional to $\gamma$. This also works for $|+\rangle$ preparation.
To improve $Z$ measurement, it is necessary to introduce an ancilla in $|0\rangle$, apply a CNOT
from the qubit to be measured to the ancilla, and measure both qubits, accepting the answer only if
the measurements agree. The error probability conditional on acceptance is again
$4\gamma/15 + O(\gamma^2)$. Detected error is much more readily managed than undetected error [16].
In our architecture, the primary role of state preparation implies that the conditional error is
typically more relevant. Improving measurement without the possibility of rejection requires an
additional ancilla and CNOT, with majority decoding of the three measurement outcomes. However, the
error probability is now $4\gamma/5 + O(\gamma^2)$.
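
The one-qubit marginal quoted for the CNOT follows from counting the 15 non-identity two-qubit
Pauli products. A minimal sketch of that count, with probabilities expressed in units of the CNOT
error parameter (the function name is illustrative):

```python
from fractions import Fraction
from itertools import product


def cnot_marginals(qubit: int = 0):
    """Marginal error on one qubit of a depolarizing CNOT, in units of gamma.

    Returns (probability of any non-identity Pauli on `qubit`,
             {pauli: probability of exactly that Pauli on `qubit`}).
    """
    errors = [p for p in product("IXYZ", repeat=2) if p != ("I", "I")]
    assert len(errors) == 15
    weight = Fraction(1, len(errors))  # each product occurs with probability gamma/15
    any_error = sum(weight for p in errors if p[qubit] != "I")
    per_pauli = {s: sum(weight for p in errors if p[qubit] == s) for s in "XYZ"}
    return any_error, per_pauli


if __name__ == "__main__":
    print(cnot_marginals())
```

Twelve of the 15 products act nontrivially on a given qubit, giving the marginal 12/15 = 4/5 in
units of the CNOT error parameter. Each individual Pauli operator on that qubit has marginal 4/15,
which coincides numerically with the value chosen for the preparation and measurement errors.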

The error model used here is idealized and does not match the error behavior of physical qubits and
gates. There are three notable differences. First, real errors include coherent rotations, but any
error can still be expressed as a linear combination of Pauli products. The syndrome measurements
serve to "collapse" the linear combinations, so that these errors can be managed. The main problem
with such errors is that consecutive errors can add coherently rather than probabilistically,
resulting in more rapid error propagation. In principle, this problem can be eliminated by
frequently applying known but random Pauli products, thus modulating the Pauli frame and reducing
the likelihood of coherent addition. This also has the beneficial effect of decoupling [17] weakly
interacting environments. The second difference is that real errors on gates nearby in time and
space have correlations. These correlations are expected to decay rapidly with distance, and their
effect can be alleviated by known coding techniques such as block interleaving [9]. The third
difference is that in many cases, qubits are defined by a subspace of a quantum system from which
amplitude can leak. An example of this problem is photon loss in optical quantum computing. Leakage
errors, particularly if undetected, can be problematic. One advantage of error-correcting
teleportation is that leakage is automatically controlled at each step.

The error model does not specify "memory" error or the amount of time used by a measurement [9]. We
assume that gates other than measurements take the same amount of time; that is, the error
parameter should represent the total error including any delays for faster gates to equalize gate
times. For the $C_4/C_6$ architecture, memory is an issue only when waiting for measurement
outcomes that determine whether prepared states are good, or that are needed after teleportation,
particularly when implementing non-Clifford gates [18,19]. If the architecture is used for
postselected computing, we can compute "optimistically", anticipating but not waiting for
measurement outcomes. The output of the computation is accepted only if all measurement outcomes
are as anticipated. Consider standard quantum computing with maximum parallelism. In teleportation,
Bell measurements determine correction gates that need to be applied. If the correction gates are
simple Pauli products, they can be absorbed into the Pauli frame and do not need to be known
immediately. For teleportations used to implement non-Clifford gates, a non-Pauli compensation may
be required. In this case, we must wait for the measurement outcomes. To avoid accumulating memory
errors, we can maintain the logical state by repeatedly applying error-correcting teleportations
with delays whose memory error is equivalent to that of physical one-qubit gates. The logical
errors for these steps are comparable to logical one-qubit gates, which are small by design. The
measurement outcomes for the teleportations determine only Pauli-frame updates and need not be
known immediately. In state preparation, using additional resources, we may continue computing
optimistically without waiting for measurement outcomes until the state is ready to be used for a
logical gate. At this point it is necessary to wait for measurement outcomes to make sure the
prepared state has no detected uncorrectable error. This adds at most the memory error incurred
during a measurement time to each qubit. To account for this we assume that gate errors are set
high enough to include this memory error.

Two additional assumptions are used in analyzing the $C_4/C_6$ architecture: The first is that
there is no error and no speed constraint on classical computations required to interpret
measurement outcomes and control future gates. The second is that two-qubit gates can be applied to
any pair of qubits without delay or additional error. This assumption is unrealistic, but the
effect on the threshold is due primarily to relatively short-range CNOTs acting within the ancillas
needed for maintaining one or two blocks. This may be accounted for by use of a higher effective
EPG [9,20].

Clifford gates for $C_4$ and $C_6$. The codes $C_4$, $C_6$ and their concatenations have the
property that encoded CNOTs, HADs and measurements that act in parallel on both encoded qubits in a
pair can be implemented "transversally" with physical qubit relabeling. For example, to apply an
encoded CNOT between two encoded qubit pairs, it suffices to apply physical CNOTs transversally,
that is, between corresponding physical qubits in the encoding blocks. HAD requires in addition
permuting the physical qubits in a block, which can be done by relabeling without physical
manipulations or error; see the supplementary information (S.I. Sect. B). The transversal
implementation ensures that errors from the physical gates apply independently to each physical
qubit in a block so that they can be managed by error detection or correction. We use two methods
for encoded state preparation. The first yields low-error encoded $|0\rangle$ and $|+\rangle$
states and follows from the Bell state-preparation scheme needed for error correction (S.I. Sect.
B). The second uses teleportation to "inject" physical states into encoded qubits. The resulting
encoded state has error but can be purified.
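
The transversality of the encoded CNOT can be illustrated for the four-qubit code by tracking how
Pauli products transform under qubit-wise CNOTs between two blocks. The sketch below uses the
standard conjugation rules ($X$ on a control spreads to the target, $Z$ on a target spreads to the
control) and ignores overall signs, which do not arise for the $X$-only and $Z$-only strings
considered. The function name is illustrative.

```python
def transversal_cnot(control: str, target: str) -> tuple[str, str]:
    """Conjugate the Pauli product (control block, target block) by qubit-wise CNOTs.

    Uses X_c -> X_c X_t and Z_t -> Z_c Z_t on each pair of corresponding
    qubits; overall signs are ignored.
    """
    to_bits = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}  # (x, z) bits
    to_char = {bits: char for char, bits in to_bits.items()}
    new_control, new_target = [], []
    for c, t in zip(control, target):
        (xc, zc), (xt, zt) = to_bits[c], to_bits[t]
        new_control.append(to_char[(xc, zc ^ zt)])  # Z copies target -> control
        new_target.append(to_char[(xt ^ xc, zt)])   # X copies control -> target
    return "".join(new_control), "".join(new_target)


if __name__ == "__main__":
    # Check operators map to products of check operators ...
    print(transversal_cnot("XXXX", "IIII"), transversal_cnot("IIII", "ZZZZ"))
    # ... and encoded operators transform as under a CNOT on each encoded qubit.
    print(transversal_cnot("XXII", "IIII"), transversal_cnot("IIII", "ZIZI"))
```

The check operators are mapped to products of check operators, so the code space is preserved,
while the encoded $X$ of the control block and the encoded $Z$ of the target block spread exactly
as the physical operators do under a CNOT.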

Error-correcting teleportation. To correct (more accurately, to keep track of) errors, we use
error-correcting teleportation, which generalizes gate teleportation [18]. It involves preparing
two blocks, each encoding a logical qubit pair so that the first pair is uniformly entangled with
the second. Two such blocks form a "logical Bell pair", and its logical state is the "logical Bell
state". Suppose that a logical Bell pair's error is as if each physical qubit were subject to
independent error of order $\gamma$. A block used for logical computation can then be
error-corrected by applying Bell measurements transversally between corresponding qubits in the
computational block and the first block of the logical Bell pair. This is the first step of
conventional quantum teleportation [21] and results in the transfer of the logical state to the
second block of the logical Bell pair, up to a known change in the Pauli frame. The Bell
measurement outcomes reveal the syndromes of the products of identical check operators on the two
blocks. Provided that the combined errors from the two measured blocks are within the limits of
what the codes used can handle, they can be determined to update the Pauli frame (S.I. Sect. A).
Compared to the syndrome extraction methods of Steane [9], error-correcting teleportation involves
only one step instead of at least two but requires preparing more complex states.

Logical Bell state preparation. State preparation networks are detailed in S.I. Sect. B. It is
necessary to prepare logical Bell states so that any errors introduced are similar to independent
physical one-qubit errors. We prepare such states by constructing encoded Bell states at each
level, using them as a resource for constructing Bell states at the next level. An encoded Bell
state can be obtained by preparing and verifying encoded $|{+}{+}\rangle|00\rangle$ in two encoded
qubit pairs and applying an encoded CNOT from the first to the second. The encoded CNOT is applied
transversally but can introduce correlations between the first and second block. To limit these
correlations to the current level, the subblocks are teleported using lower-level Bell states with
error detection or correction depending on context. The remaining correlations do not appear to
significantly affect logical errors but can be reduced by purification [22] or by entanglement
swapping [23] with two encoded Bell states. A key observation for preparing encoded $|00\rangle$
and $|{+}{+}\rangle$ is that for both $C_4$ and $C_6$, they are close to cat states such as
$(|0\ldots0\rangle + |1\ldots1\rangle)/\sqrt{2}$. In the case of $C_4$, encoded $|00\rangle$ and
$|{+}{+}\rangle$ are cat states on four qubits. For $C_6$, they are parallel three-qubit cat states
on three qubit pairs modified by internal CNOTs on two of the pairs. To prepare verified cat states
we use a minimal variant of the methods of Shor [24] starting with Bell states. For the
concatenations used here, the internal CNOTs can be implemented by relabeling the physical qubits.
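
That the encoded states of the four-qubit code are cat states can be checked directly on small
state vectors. The sketch below represents a state as a dictionary from bit strings to real
amplitudes and tests which Pauli products fix it. The helper names are illustrative, and only
products of $I$, $X$ and $Z$ are supported, which is all that is needed here.

```python
from itertools import product
from math import isclose, sqrt


def apply_pauli(pauli: str, state: dict[str, float]) -> dict[str, float]:
    """Apply a product of I, X and Z (no Y) to a state given as {bitstring: amplitude}."""
    out: dict[str, float] = {}
    for bits, amp in state.items():
        sign = 1.0
        new_bits = []
        for p, b in zip(pauli, bits):
            if p == "X":
                new_bits.append("1" if b == "0" else "0")  # bit flip
            else:
                new_bits.append(b)
                if p == "Z" and b == "1":
                    sign = -sign  # sign flip on |1>
        key = "".join(new_bits)
        out[key] = out.get(key, 0.0) + sign * amp
    return out


def stabilizes(pauli: str, state: dict[str, float]) -> bool:
    """True if `state` is a +1 eigenstate of `pauli`."""
    image = apply_pauli(pauli, state)
    keys = set(state) | set(image)
    return all(isclose(image.get(k, 0.0), state.get(k, 0.0), abs_tol=1e-12) for k in keys)


# Cat state (|0000> + |1111>)/sqrt(2).
cat_z = {"0000": 1 / sqrt(2), "1111": 1 / sqrt(2)}
# Cat state (|++++> + |---->)/sqrt(2), written in the Z basis: the uniform
# superposition of all even-weight bit strings.
even = ["".join(b) for b in product("01", repeat=4) if b.count("1") % 2 == 0]
cat_x = {bits: 1 / sqrt(8) for bits in even}
```

Here `cat_z` is fixed by both check operators and by the encoded operators $ZIZI$ and $IIZZ$, so it
is the encoded $|00\rangle$; `cat_x` is fixed by the checks and by $XXII$ and $IXIX$, so it is the
encoded $|{+}{+}\rangle$.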

Low error logical $\mathcal{G}_{\min}$ gates. The first step in establishing fault tolerance of the
$C_4/C_6$ architecture is to implement logical $\mathcal{G}_{\min}$ gates with low EPGs. For the
purpose of establishing high thresholds, we first consider postselected $\mathcal{G}_{\min}$
computing. Postselected computing is like standard quantum computing except that when a gate is
applied, the gate may fail. If it fails, this is known. The probability of success must be
non-zero. There may be gate errors conditional on success, but fault-tolerant postselected
computing requires that such errors are small. Purely postselected computing has little
computational power, but we can use it to prepare states needed to enable scalable quantum
computing. A fault-tolerant architecture for postselected computing can be implemented by use of
the $C_4/C_6$ architecture without error correction, aborting the computation whenever an error is
detected. We have used two methods to determine threshold values for $\gamma$ below which
fault-tolerant postselected $\mathcal{G}_{\min}$ computing is possible. The first involves a
computer-assisted heuristic analysis of the conditional errors in prepared encoded Bell pairs. The
analysis is described for a $C_4$ architecture in [25]. It requires that encoded Bell pairs are
purified to ensure that errors are approximately independent between each Bell pair's two blocks.
We obtained exact conditional errors for level 1 encoded Bell pairs and then heuristically bounded
them from above with an error model that is independent between the two blocks. This independence
implies that the error model for gates at the next level also satisfies strict independence, so the
process can be repeated at each level to bound the conditional logical errors. With this analysis,
thresholds of above $\gamma = 0.03$ were obtained. The second method involves direct simulation of
the error behavior of postselected encoded CNOTs with error-detecting teleportation at up to two
levels of encoding and physical EPGs of $0.01 < \gamma < 0.0375$. The simulation method is outlined
in S.I. Sect. E. The resulting conditional logical errors are shown in Fig. 2 and suggest a
threshold of above $\gamma = 0.06$ by extrapolation. At $\gamma = 0.03$, the logical preparation
and measurement errors were found to be consistent with being below the threshold.

Scalable $\mathcal{G}_{\min}$ computing with the $C_4/C_6$ architecture requires lower EPGs and the
use of error correction to increase the probability of success to near 1. To optimize the resource
requirements needed to achieve a given logical EPG, the last level at which error correction is
used is $dl$ levels below the relevant top level, where $dl$ depends on context and $\gamma$. At
higher levels, errors are only detected. For simplicity and to enable extrapolation by modeling, we
examined a fixed strategy with $dl = 1$ in all state-preparation contexts and $dl = 0$ (maximum
error correction) in the context of logical computation. The relevant top level in a
state-preparation context is the level of a block measurement or error-correcting teleportation of
a subblock, not the logical level of the state that is eventually prepared. Each logical gate now
has a probability of detected but uncorrectable error, and a probability of logical error
conditional on not having detected an error. Fig. 3 shows both error probabilities up to level 4
for a logical CNOT with error-correcting teleportation and EPGs $\gamma < 0.01$. The data indicate
that the threshold for this architecture is above 0.01. The logical preparation and measurement
errors were found to be comparatively low.

In designing and analyzing fault-tolerant architectures, particularly those based on concatenation,
care must be taken to ensure that logical errors do not have correlations that lead to larger than
expected errors when gates are composed. Such effects can be missed when inferring thresholds from
analysis or simulation of just one level of concatenation. An additional complication is that the
$C_4/C_6$ architecture's level $l+1$ gates are not implemented solely in terms of level $l$ gates.
We therefore simulated the architecture at the highest levels possible. To verify that logical
errors are sufficiently uncorrelated, we simulated sequential teleportation and checked the
incremental error behavior of each step as shown in Fig. 4.

Universal computation. To complete the $\mathcal{G}_{\min}$ gate set so that we can implement
arbitrary quantum computations, it suffices to add HAD and preparation of the state
$|\pi/8\rangle = \cos(\pi/8)|0\rangle + \sin(\pi/8)|1\rangle$ [26,27]. We treat the qubits in a
logical qubit pair identically and ignore one of them for the purpose of computation. See S.I.
Sect. D for how to take full advantage of both qubits. The logical HAD is implemented similarly to
the logical CNOT and uses one error-correcting teleportation. Its logical errors are less than
those of the logical CNOT. To prepare logical $|\pi/8\rangle$ in both qubits of a logical qubit
pair, we obtain a logical Bell pair, decode the first block of the Bell pair into two physical
qubits and make measurements to project the physical qubits' states onto $|\pi/8\rangle$ or the
orthogonal state. If an orthogonal state is obtained, we adjust the Pauli frame by $Y$'s
accordingly. Because of the entanglement between the physical qubits and the logical ones, this
prepares the desired logical state, albeit with error. This procedure is called "state injection".
To decode the first block of the Bell pair, we first decode the $C_4$ subblocks and continue by
decoding six-qubit subblocks of $C_6$. Syndrome information is obtained in each step and can be
used for error detection or correction. The error in decoding is expected to be dominated by the
last decoding steps. Consequently, the error in the injected state should be bounded as the number
of levels increases, which we verified by simulation to the extent possible. To remove errors from
the injected states, logical purification can be used [27,28] and is effective if the error of the
injected state is less than 0.141 [28]. The purification method can be implemented fault tolerantly
to ensure that the purified logical $|\pi/8\rangle$ states have errors similar to those of logical
CNOTs (S.I. Sect. G). To simplify the implementation of quantum computations, other states can be
prepared similarly.

Consider the threshold for postselected universal quantum computing. The logical HAD and injection
errors at $\gamma = 0.03$ and level 2 are shown in Fig. 2. The injection error is well below the
maximum allowed and is not expected to increase substantially for higher levels. The injection
error should scale approximately linearly with EPG, so the extrapolated threshold of
$\gamma > 0.06$ may apply to universal postselected quantum computing.

The injection and purification method for preparing states needed to complete the gate set 
works with the error-correcting C^/Cq architecture. Consider state injection at 7 = 0.01. The con- 
text for injection is state preparation, which determines the combination of error-correction and 
detection as discussed above. The conditional logical error after state injection was determined to 
be 8.6±o| x 10~ 3 at level 3 and l.l±g;J x 10~ 2 at level 4, comparable to 7 and sufficiently low 
for $|\pi/8\rangle$ purification. As a result, the $C_4/C_6$ architecture enables scalable quantum computing
at EPGs above 0.01. To obtain higher thresholds, we use fault-tolerant postselected computing
to prepare states in a code that can handle higher EPGs than $C_4/C_6$ concatenated codes can. The
states are chosen so that we can implement a universal set of gates by error-correcting teleportation.
Suppose that arbitrarily low logical EPGs are achievable with the $C_4/C_6$ architecture for universal
postselected computing. To compute scalably, we choose a sufficiently high level $l$ for the $C_4/C_6$
architecture and a very good error-correcting quantum code $C_e$. The first step is to prepare the
desired $C_e$-encoded states using level $l$ encoded qubits, in essence concatenating $C_e$ with level $l$ of
the $C_4/C_6$ architecture. The second step is to decode each block of the $C_4/C_6$ architecture to
physical qubits to obtain unconcatenated $C_e$-logical states. Once these states are successfully prepared,
they can be used to implement each logical gate by error-correcting teleportation. Simulations
show that the postselected decoding introduces an error $< \gamma$ for each decoded qubit (Fig. 2). There
is no postselection in error-correcting teleportation with $C_e$, and it is sensitive to decoding error
in two blocks ($\approx 2\gamma$) as well as the error of the CNOT ($\approx \gamma$) and the two physical measurements
($\approx 8\gamma/15$) required for the Bell measurement. Hence, the effective error per qubit that needs to be
corrected is $\approx 3.53\gamma$. The maximum error probability per qubit correctable by known codes $C_e$ is
$\approx 0.19$ [29]. Provided that $3.53\gamma \le 0.19$ and $\gamma$ is below the postselected threshold for the $C_4/C_6$
architecture, the error in the state preparation before decoding together with the logical error in
error-correcting teleportation can be made smaller than $10^{-3}$ (S.I. Sect. G). The $C_e$ architecture
can therefore be concatenated with the error-correcting $C_4/C_6$ architecture to arbitrarily reduce the
logical EPG. In view of the postselected threshold indicated by Fig. 2, scalable quantum computing
is possible at $\gamma = 0.03$ and perhaps up to $\gamma \approx 0.05$. Although the postselection overheads are
extreme, this method is theoretically efficient.
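
The error budget for error-correcting teleportation with $C_e$ can be checked with a few lines of arithmetic. The sketch below is only an illustration: it adds the three contributions quoted in the text (in units of $\gamma$) and solves for the largest $\gamma$ that keeps the total below the quoted code limit of 0.19. The function and constant names are hypothetical.

```python
from fractions import Fraction

# Per-qubit error contributions in units of gamma, as quoted in the text.
DECODING_TWO_BLOCKS = Fraction(2)     # decoding error in two blocks (~2*gamma)
BELL_CNOT = Fraction(1)               # CNOT of the Bell measurement (~gamma)
TWO_MEASUREMENTS = Fraction(8, 15)    # two physical measurements (~8*gamma/15)

CODE_LIMIT = 0.19  # maximum correctable error per qubit quoted for known codes C_e


def effective_error_coefficient():
    """Coefficient a such that the effective error per qubit is about a*gamma."""
    return DECODING_TWO_BLOCKS + BELL_CNOT + TWO_MEASUREMENTS


def max_gamma(limit=CODE_LIMIT):
    """Largest gamma for which a*gamma stays at or below the code limit."""
    return limit / float(effective_error_coefficient())


if __name__ == "__main__":
    a = effective_error_coefficient()
    print(f"effective error per qubit ~ {float(a):.2f} * gamma")
    print(f"budget satisfied up to gamma ~ {max_gamma():.3f}")
```

Under these numbers the budget alone would allow $\gamma$ slightly above 0.05, which is consistent with the tentative upper value quoted in the text; the postselected threshold remains a separate condition.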

Resources. The resource requirements for the error-correcting $C_4/C_6$ architecture can be mapped
out as a function of $\gamma$ for different sizes of computations. Since we do not have analytical
expressions for the resources for logical Bell state preparation or for the logical errors as a function of $\gamma$
and, with our current capabilities, we are not able to determine them in enough detail by simulation,
we use naive models to approximate the needed expressions. The resources required are related to
the number of physical CNOTs used, which dominates the number of state preparations and
measurements. HADs are used only for universality at the logical level. The number of physical
CNOTs used in a logical Bell state preparation is modeled by functions of the form $C/(1-\gamma)^k$,
which would be correct on average if the state-preparation network had $C$ gates of which $k$ failed
independently with probability $\gamma$, and the network were repeatedly applied until none of the $k$ gates
fail. $C$ and $k$ depend on the level of concatenation. The logical error probabilities are modeled
at level $l \ge 1$ by $p_d(l) = d(l)\,\gamma^{f(l+1)}$ (detected error) and $p_c(l) = c(l)\,\gamma^{f(l+2)}$ (conditional logical
error), where $f(0) = 0$, $f(1) = 1$, $f(l+1) = f(l) + f(l-1)$ is the Fibonacci sequence. These
expressions are asymptotically correct as $\gamma \to 0$. We verified that they model the desired values
well and determined the constants at the lower levels by simulation (S.I. Sect. G). At high levels,
the constants were estimated by extrapolating their level-dependent behavior. Using these expressions,
we determined the level of concatenation that requires the fewest resources to implement a
computation of a given size. The resulting resource graph is shown in Fig. 5.
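
The naive models above can be written down directly. The following is a minimal sketch, assuming placeholder constants: the values of `C`, `k`, `d` and `c` passed in are hypothetical inputs, since the fitted constants are given only in S.I. Sect. G. It evaluates the expected CNOT count $C/(1-\gamma)^k$ and the Fibonacci-exponent error models.

```python
def fib(n):
    """Fibonacci numbers with f(0) = 0, f(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def expected_cnots(C, k, gamma):
    """Average CNOT count when a C-gate network is retried until k gates all succeed."""
    return C / (1.0 - gamma) ** k


def detected_error(level, gamma, d=1.0):
    """Model p_d(l) = d * gamma^f(l+1); d is a placeholder constant."""
    return d * gamma ** fib(level + 1)


def conditional_error(level, gamma, c=1.0):
    """Model p_c(l) = c * gamma^f(l+2); c is a placeholder constant."""
    return c * gamma ** fib(level + 2)


if __name__ == "__main__":
    gamma = 0.01
    for level in range(1, 5):
        print(level, fib(level + 1), fib(level + 2),
              detected_error(level, gamma), conditional_error(level, gamma))
```

The exponents grow as Fibonacci numbers rather than doubling at each level, which is why the modeled errors fall more slowly with level than they would for a code that squares the error at every concatenation step.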

Since interesting quantum computations use many non-Clifford gates, it is necessary to estimate
the average resources required for preparing states such as the $|\pi/8\rangle$ state. One instance of
this state suffices for implementing a 45° $Y$ rotation. Two are required for a phase-variant of a
Toffoli gate. Consider $\gamma = 0.01$. At level 4 of the $C_4/C_6$ architecture, one purification stage requires
$\approx 370$ logical CNOTs (S.I. Sect. G). It is likely that this overhead can be significantly improved,
but it must be accounted for when using the graphs of Fig. 5 as discussed in the caption. It is
possible to implement a computation with 100 logical qubits and up to 1000 $|\pi/8\rangle$-preparations using
$1.23 \times 10^{14}$ physical CNOTs (S.I. Sect. G). This takes into account the probability that the
computation fails with a detected error. The conditional probability of obtaining an incorrect output is
$\approx 0.02$. Such a computation is non-trivial in the sense that its output is not efficiently predictable
using known classical algorithms. The resource requirements are large but would be reasonable in
the context of classical computing: Central processing units have $10^8$ or more transistors operating
at rates faster than $10^9$ bit operations per second [30].

We compare our resource requirements to those of Steane's architecture based on an example
at $\gamma = 10^{-4}$ detailed in [9]. Steane's architecture is based on non-concatenated block codes, which
are expected to be more efficient at such low EPGs [5]. Steane's example has an effective logical
error per qubit of $\approx 7 \times 10^{-12}$ using $\approx 420$ physical CNOTs per qubit per gate. Our architecture
achieves detected errors of $5.5 \times 10^{-9}$ (level 3) or $6 \times 10^{-14}$ (level 4) using respectively $\approx 2100$
or $2.6 \times 10^4$ physical CNOTs per qubit (S.I. Sect. G). The conditional logical errors are much
smaller. The $C_4/C_6$ architecture's resource requirements are still within two orders of magnitude
of Steane's at $\gamma = 10^{-4}$. The $C_4/C_6$ architecture has the advantage of simplicity, of yielding more
reliable answers conditional on having no detected errors, and of operating at higher EPGs.
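
The "within two orders of magnitude" statement follows from the CNOT counts quoted above. As a small illustrative check (the dictionary and function names are hypothetical), the ratios at $\gamma = 10^{-4}$ can be computed as follows:

```python
import math

STEANE_CNOTS = 420                    # physical CNOTs per qubit per gate, Steane's example
C4C6_CNOTS = {3: 2100, 4: 2.6e4}      # physical CNOTs per qubit at levels 3 and 4


def overhead_ratio(level):
    """Ratio of C4/C6 CNOT count to Steane's count at the given level."""
    return C4C6_CNOTS[level] / STEANE_CNOTS


def orders_of_magnitude(level):
    """Base-10 logarithm of the overhead ratio."""
    return math.log10(overhead_ratio(level))


if __name__ == "__main__":
    for level in (3, 4):
        print(level, round(overhead_ratio(level), 1), round(orders_of_magnitude(level), 2))
```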

Discussion. How high must EPGs be so that it is not possible to scalably quantum compute?
It is known that if unbiased one-qubit EPGs exceed 0.5, then we can simulate the effect of gates
classically [31, 32]. Furthermore, if one-qubit EPGs exceed 0.25, then we cannot realize a quantum
computation "faithfully", that is by encoding the computation's qubits with quantum codes [33, 34].
This is because the quantum channel capacity vanishes at a depolarizing error probability above
0.25. Faithful techniques are likely to require at least three sequential gates before an error can be
eliminated (in our case these are preparation gates whose errors remain in the logical Bell pairs,
a CNOT and a measurement for teleportation). Thus one would not expect to obtain thresholds
above $\approx 0.09$ using faithful methods. This is not far from the extrapolated 0.05 evidenced by our
work. Note that the thresholds obtained here are similar to those for quantum communication [35].
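
One way to see where a bound near 0.09 could come from is sketched below. It assumes, as an illustration only, that errors from three sequential gates compound independently, so that the accumulated error is $1 - (1-p)^3$, and asks for the largest per-gate error $p$ that keeps this at or below the 0.25 capacity limit. The text does not state which compounding rule was used; the simpler estimate $0.25/3 \approx 0.083$ gives a similar value.

```python
def max_faithful_epg(capacity_limit=0.25, sequential_gates=3):
    """Largest per-gate error p with 1 - (1 - p)**gates <= capacity_limit.

    Assumes independent compounding of errors over the sequential gates.
    """
    return 1.0 - (1.0 - capacity_limit) ** (1.0 / sequential_gates)


if __name__ == "__main__":
    print(f"independent compounding: {max_faithful_epg():.4f}")
    print(f"linear estimate:         {0.25 / 3:.4f}")
```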

An important use of studies of fault-tolerant architectures is to provide guidelines for EPGs that
should be achieved to meet the low-error criterion for scalability. Such guidelines should depend
on the details of the relevant error models and constraints on two-qubit gates. Nevertheless, the
value of $\gamma = 10^{-4}$ has often been cited as the EPG to be achieved. With architectures such as
Steane's [9, 36] and the one introduced here, resource requirements at $\gamma = 10^{-3}$ are now comparable
to what they were for $\gamma = 10^{-4}$ [37] at the time this value was starting to be cited.

Several open problems arise from the work presented here. Can the high thresholds evidenced
by our simulations be mathematically proven? Are thresholds for postselected computing strictly
higher than thresholds for scalable standard quantum computing? Recent work by Reichardt [36]
shows that Steane's architecture can be made more efficient by the judicious use of error detection,
improving Steane's threshold estimates to around $10^{-2}$. How do the available fault-tolerant
architectures compare for EPGs between $10^{-3}$ and $10^{-2}$? It would be helpful to significantly improve
the resource requirements of fault-tolerant architectures, particularly at high EPGs.

FIGURE. 1: Block structure of $C_4/C_6$ concatenated codes. The bottom line shows 9 blocks of
four physical qubits. Each block encodes a level 1 qubit pair with $C_4$. The encoded qubit pairs
are shown in the line above. Formally, each such pair is associated with two syndrome bits, shown
below the encoded pair in a lighter shade, which are accessible by syndrome measurements or
decoding for the purpose of error detection and correction. The next level groups three level 1
qubit pairs into a block, encoding a level 2 qubit pair with $C_6$ that is associated with 4 syndrome
bits. A level 2 block consists of a total of 12 physical qubits. Three level 2 qubit pairs are used to
form a level 3 qubit pair, again with $C_6$ and associated with 4 syndrome bits. The total number of
physical qubits in a level 3 block is 36.
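
The block sizes in this caption follow a simple pattern: four physical qubits at level 1, and a factor of three for each further level. A short sketch (function names are hypothetical) that reproduces the quoted counts:

```python
def physical_qubits_per_block(level):
    """Physical qubits in one level-`level` block: 4 at level 1, tripled per extra level."""
    if level < 1:
        raise ValueError("level must be at least 1")
    return 4 * 3 ** (level - 1)


def syndrome_bits_added(level):
    """Syndrome bits associated with the encoding at this level (2 for C4, 4 for C6)."""
    if level < 1:
        raise ValueError("level must be at least 1")
    return 2 if level == 1 else 4


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, physical_qubits_per_block(level), syndrome_bits_added(level))
```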

FIGURE. 2: Conditional logical errors with postselection. The plot shows logical CNOT errors
conditional on not detecting any errors as a function of EPG parameter $\gamma$ at levels 0, 1 and 2. The
logical CNOT is implemented with transversal physical CNOTs and two error-detecting teleportations,
where the output state is accepted only if no errors are detected in the teleportations. The
data show the incremental error attributable to the logical CNOT in the context of a longer
computation, as explained in S.I. Sect. E. The error bars are 68 % confidence intervals. The solid lines
are obtained by least-squares interpolation followed by gradient-descent likelihood maximization.
Extrapolations are shown with dashed lines and suggest that logical EPG improvements with
increasing levels are possible above $\gamma = 0.06$. The error in the slope (4.32) of the level 2 line is
estimated as 0.52 by resampling. The smallest number of undetectable errors at level 2 is 4, which
should be the slope as $\gamma$ goes to 0. At high $\gamma$, the curves are expected to level off [9]. Other
operations' errors for $\gamma = 0.03$ and level 2 are shown in the inset table. Ratios between the preparation
or measurement and CNOT errors are smaller than those assumed for the physical error model.
The logical HAD error is expected to be between 0.5 and 0.8 of the logical CNOT error, which
could not be confirmed because of the large error bars. The decoding error is the incremental
error introduced by decoding a block into two physical qubits. The injection error is the error in
a logical state that we prepare by decoding one block of a logical Bell pair and measuring the
decoded qubits. The measurement error per qubit is assumed to be the same as that of X- and
Z-measurements. Decoding and injection errors were found to decrease from level 1 (decoding 
error 4.4±gi x 10~ 2 , injection error 5.5±!J;! x 10~ 2 ) to level 2. 

FIGURE. 3: Conditional and detected logical errors with error correction. The plot shows
incremental detected and conditional logical errors for a logical CNOT as a function of EPG parameter
$\gamma$ up to level 4. Error bars and lines are as described in the caption of Fig. 2. The data are obtained
as described in S.I. Sect. E. The combination of error correction and detection is as required for
the error-correcting $C_4/C_6$ architecture. Plot a. shows the logical CNOT's error conditional on not
detecting an uncorrectable error. Plot b. shows the probability of detecting an uncorrectable error.
At $\gamma = 0.01$, the detected errors are 2.4±g° x 10" 2 (level 3) and 2.4±i? x 10~ 3 (level 4). The
conditional errors are 6.4±{J| x 10~ 4 (level 3) and 0.0±ao x 10~ 4 (level 4). For comparison, the 
preparation errors at levels 3,4 were found to be 2.1±gj x 10~ 4 , 0.0±J:!j x 10~ 4 (detected error) 
and 3.3±2:f x 10~ 6 , 0.0±£;[j x 10~ 4 (conditional error). The measurement errors are 4.7±o:t x 10~ 4 , 
5.6±I 2 6 8 x 10" 5 (detected error) and 3.3±I: 4 x 10" 6 , 0.0±J;g x 10" 4 (conditional error). Finally, the 
HAD errors at level 3 are 1.3±ao x 10~ 2 (detected error) and 3.5±SJ| x 10~ 4 (conditional error). 

FIGURE. 4: Error-compounding behavior with and without error-correcting teleportation.
Incremental conditional (a.) and detected (b.) error probabilities are shown for each step of a sequence
of 30 steps of applying the one-qubit error associated with HAD to each physical qubit and
teleporting or not teleporting the logical qubit pair's block. Error bars are 68 % confidence intervals.
Level 3 of the error-correcting architecture is used. The first step is omitted since it is biased by
the error-free reference-state preparation as discussed in S.I. Sect. E. The horizontal gray lines
show the average incremental error if teleportation is used. Note that for the first four steps, the
incremental conditional error is smaller if no teleportation is used. This may be exploited when
optimizing networks, provided one takes account of the resulting spreading of otherwise localized

FIGURE. 5: Resources per qubit and gate for different computation sizes. The size KQ of a
computation is the product of the number of gates (including "memory" gates) and the average
number of qubits per gate. Each curve is labeled by the computation size and shows the number
pcnot of physical CNOTs required per qubit and gate to implement a computation of size KQ
with the $C_4/C_6$ architecture. The curves are based on naive models of resource usage in state
preparation and of the logical errors (S.I. Sect. G). The circled numbers are at the point above
which the indicated or a higher level must be used. The curves are most reliable for levels $< 4$.
To obtain the total computational resources, multiply pcnot $\times$ KQ by twice the average number of
logical CNOTs needed for implementing a gate of the computation. It is assumed that these logical
CNOTs are involved in state preparation required for universality but do not contribute to the error
(S.I. Sect. G). The "scale-up" (number of physical qubits per logical qubit) depends on parallelism
and level $l$ of concatenation. With maximum parallelism, the scale-up is of the same order as pcnot.
For a completely sequential algorithm such as could be used if there is no memory error, this can be
reduced to $3^{l-1} \cdot 2$. With some memory error and logical gate parallelism, $\approx (1 + 2(l-1))\,3^{l-1} \cdot 2$ is
more realistic (S.I. Sect. G). The steps in the curve arise from increasing the number of levels. The
first step is to level 2, and each subsequent step increments the level by 1. The steps are smoothed
because we can exploit error-detection to avoid using the next level. Improvements of only one to
two orders of magnitude are obtained by reducing $\gamma$ from 0.001 to 0.0001, compared to at least
five orders by reducing $\gamma$ from 0.01 to 0.001.
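
The two scale-up estimates in this caption can be tabulated per level. The sketch below assumes the readings $3^{l-1} \cdot 2$ for the fully sequential case and $(1 + 2(l-1))\,3^{l-1} \cdot 2$ for the case with some memory error and logical gate parallelism; the function names are hypothetical.

```python
def scale_up_sequential(level):
    """Physical qubits per logical qubit for a fully sequential algorithm: 2 * 3^(l-1)."""
    return 2 * 3 ** (level - 1)


def scale_up_with_parallelism(level):
    """Estimate with some memory error and gate parallelism: (1 + 2(l-1)) * 2 * 3^(l-1)."""
    return (1 + 2 * (level - 1)) * scale_up_sequential(level)


if __name__ == "__main__":
    for level in range(1, 5):
        print(level, scale_up_sequential(level), scale_up_with_parallelism(level))
```

The sequential value is half the block size because each block of $4 \cdot 3^{l-1}$ physical qubits carries a pair of logical qubits.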

Supplementary Information

A Explanation of Error-Correcting Teleportation

For the basic theory of stabilizer codes, see [15]. Let $Q$ be the $l \times 2n$ binary check matrix with
entries defining a stabilizer code on $n$ qubits for encoding $k = n - l$ qubits with good
error-detecting or -correcting properties. The check matrix is obtained from an independent set of check
operators $P_1, \ldots, P_l$ by placing a binary representation of $P_k$ into row $k$. The binary representation
of $P$ is obtained by replacing the Pauli operator symbols according to $I \mapsto 00$, $X \mapsto 10$, $Z \mapsto 01$
and $Y \mapsto 11$. For example, the check operator $XIY$ is represented by the row vector $[100011]$.
Commas are omitted in binary vectors, and square brackets are used to distinguish them from
binary strings. The syndrome of a joint eigenstate of the check operators is denoted by a binary
column vector, with 0 (1) in the $k$'th position denoting a $+1$ ($-1$, respectively) eigenvalue of $P_k$. The
projection operator onto the eigenspace with syndrome $x$ is denoted by $\Pi(Q, x)$. If a Pauli product
$P$ with binary representation $g$ is applied to a state with syndrome $x$, the new syndrome $x'$ is given
by $x' = x + QSg^T$. Arithmetic with binary vectors and matrices is modulo 2 and $S$ is the $2n \times 2n$
block-diagonal matrix with blocks $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.
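
The binary representation and the syndrome update rule $x' = x + QSg^T$ can be illustrated in a few lines. This is a minimal sketch: the helper names are hypothetical, and the example check matrix (checks $XXXX$ and $ZZZZ$ on four qubits) is an assumption chosen for illustration rather than taken from this section.

```python
# Two bits per qubit in the order (x-bit, z-bit): I -> 00, X -> 10, Z -> 01, Y -> 11.
PAULI_BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}


def pauli_to_bits(pauli):
    """Binary row vector of a Pauli product such as 'XIY'."""
    return [bit for symbol in pauli for bit in PAULI_BITS[symbol]]


def symplectic_product(row, g):
    """row * S * g^T mod 2, where S is block-diagonal with blocks [[0, 1], [1, 0]]."""
    total = 0
    for q in range(0, len(row), 2):
        total += row[q] * g[q + 1] + row[q + 1] * g[q]
    return total % 2


def updated_syndrome(Q, x, g):
    """x' = x + Q S g^T (mod 2) for check matrix Q, syndrome x and Pauli vector g."""
    return [(xi + symplectic_product(row, g)) % 2 for xi, row in zip(x, Q)]


# Illustrative check matrix (an assumption): four-qubit code with checks XXXX and ZZZZ.
Q = [pauli_to_bits("XXXX"), pauli_to_bits("ZZZZ")]

if __name__ == "__main__":
    print(pauli_to_bits("XIY"))
    for error in ("XIII", "IZII", "IIYI", "XXII"):
        print(error, updated_syndrome(Q, [0, 0], pauli_to_bits(error)))
```

A syndrome bit flips exactly when the applied Pauli product anticommutes with the corresponding check operator, so the two-qubit product $XXII$ leaves the syndrome of this example code unchanged.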



Consider an $n$ qubit "input" block carrying $k$ qubits encoded in the stabilizer code for $Q$, where
the block has been affected by errors. An effective way of detecting or correcting errors is to
teleport each of the $n$ qubits of the input block using two blocks of $n$ qubits that form an "encoded
Bell pair". That is, both blocks have syndrome 0 with respect to $Q$ and corresponding qubits
encoded in the two blocks are in the state $(|00\rangle + |11\rangle)/\sqrt{2}$. The state of the two blocks is defined
by the following preparation procedure: Start with $n$ pairs of qubits in the standard Bell state
$(|00\rangle + |11\rangle)/\sqrt{2}$. The two blocks are formed from the first and second members of each pair,
respectively. Use a $Q$-syndrome measurement on the $n$ second members of each pair to project
them into one of the joint eigenspaces of $Q$. Finally, apply identical Pauli matrices to both members
of pairs in such a way as to reset the syndromes to 0. To teleport, apply the usual protocol to
corresponding qubits in the three blocks. In the absence of errors, this copies the encoded input
state to the second block of the encoded Bell pair. We show that errors are revealed by parities of
the teleportation measurement outcomes.

The standard quantum teleportation protocol begins with an arbitrary state $|\psi\rangle$ in qubit 1 and
the Bell state $(|00\rangle + |11\rangle)/\sqrt{2}$ in qubits 2, 3. The global initial state can be viewed as $|\psi\rangle$
encoded in the stabilizer code generated by $IXX$ and $IZZ$, whose check matrix has rows $b_1 =
[001010]$ and $b_2 = [000101]$. Let $B^{(23)}$ be the check matrix whose rows are the $b_i$. The stabilizer
consists of the system-labeled Pauli products $I$, $\sigma_x^{(2)}\sigma_x^{(3)}$, $-\sigma_y^{(2)}\sigma_y^{(3)}$ and $\sigma_z^{(2)}\sigma_z^{(3)}$. To teleport,
one makes a Bell-basis measurement on the first two qubits. This is equivalent to making a
$B^{(12)}$-syndrome measurement, where $B^{(12)}$ has as rows $[101000] = [XXI]$ and $[010100] = [ZZI]$. This
is identical to $B^{(23)}$ with qubits 2, 3 exchanged for qubits 1, 2. Depending on the syndrome $e$ that
results from the measurement, one applies correcting Pauli matrices to qubit 3 to restore $|\psi\rangle$ in
qubit 3.

Consider the teleportation of n qubits in a block as described above. The protocol is such that 
the 2n binary measurement outcomes linearly (with respect to computation modulo 2) determine 
the Pauli product correction to be applied to the second block of the encoded Bell pair. Let g be the 
binary representation of the Pauli product correction. The syndrome of the input block constrains 
g as shown in Fig. 6. The principle is as explained in [18] for unitary gates, but generalized to 
measurements. In this case, a stabilizer projection on the destination qubits before teleportation is 
equivalent to a projection after teleportation, where the syndrome associated with the projection is 
modified by the correction Pauli product used at the end of teleportation. The expression $QSg^T$
must match the syndrome of the input block. Consequently, the syndrome of the input block can 
be deduced from g, a function of the teleportation Bell measurement. Errors can be detected or 
corrected accordingly. 
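
The inference of the input syndrome from the Bell measurement outcomes can be sketched as follows. The code is an illustration under stated assumptions: it uses the common teleportation convention in which the X-measurement outcome selects a $Z$ correction and the Z-measurement outcome selects an $X$ correction, it takes the four-qubit code with checks $XXXX$ and $ZZZZ$ as an example, and it assumes the encoded Bell pair has syndrome 0. All names are hypothetical.

```python
def correction_vector(outcomes):
    """Binary representation g of the Pauli correction.

    `outcomes` is a list of (x_meas, z_meas) bits, one pair per teleported qubit.
    Assumed convention: the correction on each qubit is X^z_meas Z^x_meas,
    stored as (x-bit, z-bit).
    """
    g = []
    for x_meas, z_meas in outcomes:
        g.extend([z_meas, x_meas])
    return g


def inferred_syndrome(Q, g):
    """Q S g^T mod 2, with S block-diagonal with blocks [[0, 1], [1, 0]]."""
    syndrome = []
    for row in Q:
        bit = sum(row[q] * g[q + 1] + row[q + 1] * g[q] for q in range(0, len(row), 2))
        syndrome.append(bit % 2)
    return syndrome


def accept(Q, outcomes):
    """Error-detecting use: accept only if the inferred syndrome is all zeros."""
    return not any(inferred_syndrome(Q, correction_vector(outcomes)))


# Example check matrix (an assumption): rows for XXXX and ZZZZ.
Q = [[1, 0] * 4, [0, 1] * 4]

if __name__ == "__main__":
    clean = [(0, 0)] * 4
    flipped = [(0, 1)] + [(0, 0)] * 3   # one Z-measurement outcome flipped
    print(accept(Q, clean), accept(Q, flipped))
```

In an error-correcting setting the inferred syndrome would instead be passed to a decoder for the code; the accept/reject rule shown here corresponds to pure error detection.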

It is necessary to consider the effects of errors in the prepared encoded Bell pair. Errors on the 
second block propagate forward and must be handled by future teleportations. Because of the Bell 
measurement, errors on the first block have an effect equivalent to the same errors on the input 
block. Thus, using the inferred syndrome for detection or correction of errors deals with errors in 
both blocks, as long as their combination is within the capabilities of the code. 

Error correction or detection by teleportation handles leakage errors in the same way as other 
errors. If a qubit "leaked", the outcome of its Bell measurement becomes undetermined. The Bell 
measurement can be filled in arbitrarily, because for the purpose of interpreting the syndrome, the 
effect is the same as if a Pauli error occurred depending on how the measurement result is filled in. 

Note that as usual, none of the Pauli corrections actually have to be implemented explicitly. 
One can just update the Pauli frame as needed. 

FIGURE. 6: Teleporting with an encoded entangled state is equivalent to a syndrome measurement.
The gray lines are the time lines of blocks of $n$ qubits. The boxes denote various operations.
The Bell-state preparation on corresponding pairs of qubits in two blocks is depicted with a box
angled to the right and labeled "Bell". The state used for teleportation in the top diagram is obtained
after Bell-state preparation by projecting one of the blocks with $\Pi(Q, 0)$ (the actual preparation
procedure is different but has the same output). Projection operators are shown with boxes
angled both ways with the operator written in the box. Bell measurement of corresponding pairs of
qubits in two blocks is depicted with a box angled to the left and labeled "Bell". A Bell
measurement on qubits 1 and 2 is implemented by applying a CNOT from qubit 1 to 2, performing
an X-measurement on qubit 1 and a Z-measurement on qubit 2. The top diagram is the actual
network implemented. The other two are logically equivalent. The Bell measurement outcome $g$
is correlated with the effective projection in the bottom diagram. If the input state has a particular
syndrome, then only $g$ for which the projection is onto the subspace with this syndrome have
non-zero probabilities.

B Networks for $C_4$ and $C_6$ State Preparation and Gates

FIGURE. 7: Network elements. The elements shown represent networks acting on blocks of
qubits. Blocks (shown by thick gray lines) may consist of only one physical qubit, so the elements
can also represent physical gates. Elements with "fringes" are transversal gates: The indicated
gate is applied to each physical qubit, or to corresponding physical qubits in the input blocks. The
measurements have classical output indicated with a black line. Because they are transversal, the
output contains as many bits as there are physical qubits. Because the codes used here are CSS
codes, the check operators and the encoded Pauli operators contain only one type of non-identity
Pauli operator. The output bits therefore contain both error-check information and the
encoded-measurement answers. The $*u$ and $*u^2$ elements are defined as shown for physical qubit pairs. The
notation comes from a polynomial construction of $C_6$ as a code on three quaternary qudits using
the four-element field GF(4). The symbol $u$ denotes a third root of unity over GF(2). The gates
transform Pauli operators by multiplication with $u$ or $u^2$ in a GF(4) labeling of these operators.

FIGURE. 8: Encoded state preparations in terms of lower-level elements for $C_4$. The lower-level
blocks ("subblocks") can be either physical or encoded single qubits and are represented by the
merging lines in the networks on the right. In the $C_4/C_6$ architecture, $C_4$ is used only at level 1,
so the subblocks are always physical qubits. In this case, the output block contains an encoded
qubit pair with each qubit in the pair in the state indicated by the preparation gate on the left.
The physical states prepared are four-qubit cat states ($(|0000\rangle + |1111\rangle)/\sqrt{2}$ in the case of the
top network). If no error occurred, the four measurements in each network on the right have total
parity 0. For any single error in the state preparation network, if this error results in an error
in the output state that is not equivalent to a single physical qubit error, then the parity is 1, so
this event can be detected. Thus, if the total measurement parity is 1, the output state is rejected.
This ensures that errors occurring with linear probability in the EPGs introduce no undetectable
errors. Note that the networks on the right begin with Bell-state preparations. The teleportation
steps are not implemented on physical qubits but are included for generality. The encoded Z-
and X-preparations shown assume that the next step is a transversal CNOT followed by subblock
teleportations. Otherwise it may be necessary to teleport subblocks immediately to avoid error
propagation.







FIGURE. 9: Encoded state preparations in terms of lower-level elements for $C_6$. The lower-level
blocks ("subblocks") contain encoded qubit pairs. The beginning of the network prepares two
parallel three-qubit cat states ($(|000\rangle + |111\rangle)/\sqrt{2}$ in the top network) on corresponding members
of the encoded qubit pairs. The encoded measurements in the cat-state preparation satisfy the
parity constraint described in the caption of Fig. 8 for each of the three corresponding qubits in the
encoded qubit pairs. Because the measurements are implemented transversally, they also provide
lower-level syndrome information that can be used for error detection or correction. Again, the
networks begin with Bell pair preparations and the teleportations are only implemented on encoded
qubit pairs. The last elements rotate the parallel cat states into $C_6$, so that the encoded qubit pair has
both qubits in the desired state. Because the first level encoding uses $C_4$, they can be implemented
as simple permutations, which can be accomplished by logical relabeling without delay or error,
see Fig. 10. As in Fig. 8, the encoded Z- and X-preparations shown assume that the next step is a
transversal CNOT followed by subblock teleportations. Otherwise it may be necessary to teleport
subblocks immediately to avoid error propagation.

FIGURE. 11: Implementation of subblock teleportation. If the blocks are physical qubits or pairs
of physical qubits, the teleportation elements on the left are not implemented. Otherwise the
networks shown on the right are used on the lower-level blocks. The top network is for $C_4$ and the
bottom for $C_6$. Note that the networks look like traditional teleportation of each subblock.
However, if the subblocks are not physical, the encoded Bell states used imply that the teleportations
are error-detecting/correcting for the subblocks.







FIGURE. 12: Implementation of encoded HADs for $C_4$ and $C_6$. The top network is for $C_4$ and
is transversal except for an interchange of the middle two qubits. The bottom is for $C_6$ and is
transversal. Using the HAD and CNOT implementations, it is also possible to implement the
encoded conditional sign flip transversally up to a physical qubit permutation implementable by
relabeling.

As can be seen, all preparation networks are based ultimately on Bell state preparation followed
by full or half Bell measurements. As shown, the networks use teleportation fastidiously. It may 
be possible to delay teleportation in some cases, but this was not confirmed by simulation. For 
postselected computing, there is no need to wait for measurement outcomes before proceeding to 
the next steps. However, this delays the rejection of states found later to be faulty, which incurs a 
large resource cost if the probability of detecting an error is high. For standard quantum computing, 
this resource cost can be avoided by delaying further processing and incurring some memory error 
instead. If error correction is used, at higher levels the probability of unrecoverable error decreases 
rapidly so one can again proceed optimistically, before measurement answers are known. 

C Decoding $C_4$ and $C_6$

There are two reasons to explicitly decode logical states encoded by concatenating $C_4$ and $C_6$.
First, at the highest EPGs, implementing a standard quantum computation with the postselected
$C_4/C_6$ architecture requires preparing $C_4/C_6$ encoded states that are themselves states encoded in
a code $C_e$ with very good error-correction capabilities. Once such a state is prepared, the $C_4/C_6$
concatenation hierarchy is decoded to obtain a physical block encoding a state in $C_e$. Second,
implementing arbitrary quantum computations requires preparing special encoded states that are not
reachable using $G_{\min}$ and HAD gates alone. These encoded states need not be error-free initially,
since they can be purified using low-error logical $G_{\min}$ and HAD gates. A way to prepare these
states with error that is bounded independently of the number of levels is to prepare a logical Bell
state in two blocks, decode the first block into two physical qubits, and make a measurement of
the physical qubits to project them into the desired state. (Alternatively, but with more error, the
measurement can be replaced by a teleportation of the desired state prepared in another pair of
physical qubits.) The entanglement between the physical qubits and the logical ones in the second
block ensures that the state is injected into the logical qubits.

A good method for decoding the $C_4/C_6$ concatenation hierarchy is to decode "bottom up".
That is, in the first step, the blocks of four physical qubits encoding qubit pairs in $C_4$ at the lowest
level of the hierarchy are decoded. Syndrome information becomes available in a pair of ancillas
for each block of $C_4$ and can be used for error detection. In subsequent steps, six physical qubits
encoding qubit pairs in $C_6$ are similarly decoded. Error information obtained in previous decoding
steps can be combined with new syndrome information for error detection or correction. The $C_4$
and $C_6$ decoding networks are shown in Fig. 13.

FIGURE. 13: Decoding $C_4$ (left) and $C_6$ (right). The gates are shown in plain form to indicate
that they are physical, not encoded. The measurements reveal the syndrome and can be used
for error detection. Error correction can be used if (a) the incoming qubits were decoded in an
earlier step and (b) exactly one of them (for $C_4$) or one pair of them ((1, 2), (3, 4) or (5, 6) for
$C_6$) was detected to have an error. The $C_6$ decoding can be simplified if the first level of the full
concatenation hierarchy uses $C_4$: The first step is a $*u^2$ operation on the first and last pair and can
be implemented by relabeling before the level 1 blocks of $C_4$ are decoded. Even if the $C_6$ decoding
is implemented with maximum parallelism and without waiting for measurement outcomes, it has
an initial memory delay on qubits 5 and 6 that was not taken into consideration in the simulations.

D Universal Computing

Universal computing with logical qubits encoded with $C_4/C_6$ can be accomplished by use of the
logical $G_{\min}$ gates, HADs and $|\pi/8\rangle$-state preparation. However, since these operations do not
distinguish between the two logical qubits encoded in one block, computations are implemented
on only one of the two logical qubits in each block. Because the other one experiences the same
evolution, the computation's output is obtained twice each time it is run. It is desirable to be able
to address the two logical qubits in a block separately and have the ability to apply a CNOT from
one to the other. One operation that is already available is the $*u$ gate and its inverse, which acts on
a logical qubit pair as a swap followed by a CNOT. As with all stabilizer codes, it is also possible
to apply arbitrary combinations of logical Pauli matrices by applying suitable products of physical
Pauli matrices or by making a Pauli frame change. Fig. 14 shows how to use the gates mentioned
to implement a set of gates that is sufficiently rich to address individual qubits in a pair.
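
The description of the $*u$ gate as a swap followed by a CNOT can be checked on computational basis states. The sketch below assumes that the CNOT is controlled by the first qubit of the pair after the swap (the text does not fix the direction) and verifies that applying the operation three times gives the identity, as expected for multiplication by a third root of unity. The function names are hypothetical.

```python
from itertools import product


def swap_then_cnot(a, b):
    """Act on basis state |a, b>: swap the pair, then CNOT from the first to the second qubit."""
    a, b = b, a          # swap
    return a, a ^ b      # CNOT with the first qubit as control (assumed direction)


def apply_repeatedly(state, times):
    """Apply swap_then_cnot `times` times to a basis state (a, b)."""
    for _ in range(times):
        state = swap_then_cnot(*state)
    return state


if __name__ == "__main__":
    for state in product((0, 1), repeat=2):
        print(state, [apply_repeatedly(state, t) for t in (1, 2, 3)])
```

Because the operation has order three on basis states, applying it twice undoes a single application, which matches the role of $*u^2$ as the inverse of $*u$.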

FIGURE. 14: Implementing selective gates on qubit pairs. The networks are shown for physical
qubits, but can be used with logical qubits by making the appropriate substitutions. The
implementations on the right use only gates that do not distinguish the qubits in a pair, $*u$, $*u^2$ and
Pauli products. The top network shows how to implement any gate $U$ selectively on one qubit in a
pair. The implementation uses a selective Pauli operator and non-selective controlled-$U$ gates. The
bottom network shows how to implement a type of controlled phase gate between the two logical
qubits in a pair. It uses a $*u$ operation, a selective 90° $Z$-rotation (which can be implemented using
the top network) and a $*u^2$ operation. The CNOT (without swap) between the qubits in a pair can
be implemented in terms of the controlled phase gate shown and selective one-qubit gates.

The networks shown in Fig. 14 do not result in particularly efficient ways of implementing
gates on individual logical qubits. An alternative is to inject and purify states needed for
one-qubit teleportation of the desired gates using the techniques given in [19]. An example of such a
state is $|0\rangle|{+}\rangle$. Note that $|0\rangle|{+}\rangle$ is much more readily purified than $|\pi/8\rangle$. For example, to purify
$|0\rangle|{+}\rangle$ one can apply the method suggested in the main text to reduce the preparation error. In
the encoded setting, this requires a measurement of $Z$ and $X$ of the qubits in a pair, which cannot
be done by a transversal encoded measurement. Instead, a third instance of $|0\rangle|{+}\rangle$ is introduced
and involved in a transversal Bell measurement with the block of the qubits to be measured. As
in error-correcting teleportation, the desired information can be extracted from parities of the Bell
measurements. At the same time, syndrome information that can be used for error detection and
correction is obtained.
E Simulation of Error Behavior 



To simulate the error behavior of fault-tolerant methods based on stabilizer codes, we use the re- 
sult that computation with Clifford gates and feed-forward from Z- and X-measurements can be 
efficiently simulated [10]. The Clifford gates include Z- and X-state preparations and measurements, 
HAD, CNOT and 90° Z-rotations. Networks using these gates always result in stabilizer states, 
which are eigenstates of maximal sets of commuting check operators (the check matrix). Sim- 
ulation requires tracking a complete independent set of such check operators and the syndrome 
(which gives the state's eigenvalues with respect to the check operators). Check operators can be 
represented by binary vectors (see Sect. A). To simplify the computations required for updating the 
check matrix and syndrome after applying gates, we maintain it in "graph-state normal form" [39, 40]. 
In this form, each qubit has an associated "commuting" operator, which is either X or Z, and there 
is exactly one check operator acting on the qubit with an operator different from I or the commuting 
operator. In addition to the check matrix and the ideal syndrome, we maintain the Pauli products 
representing the current effect of errors (the "error vector") and the Pauli frame. The error vector 
is known only to the simulation, not to the user implementing a computation. The error vector and 
Pauli frame are updated with each operation. For efficiency, blocks that have not yet interacted are 
associated with separate check matrices and "merged" when needed. Also, since it is necessary 
to accumulate as much statistics as possible, an array of error vectors and corresponding Pauli 
frames is used to represent multiple simultaneous preparation attempts without duplicating check 
matrices. For rapid prototyping purposes and fast array processing, we used Octave to implement 
the simulator. For simulating measurements and errors, a random-number generator is needed. We 
used the standard random-number generator provided with Octave. Because this implies that there 
are implicit correlations in the errors for the large-scale simulations undertaken here, the results 
obtained do not constitute full statistical proof. However, no artifacts not explainable by statistics 
were observed. In particular, in the few cases where an analytic expression for the data was 
available, the simulated data were as expected. This was checked for conditional error probabilities in 
postselected computing using concatenation with $C_4$ as discussed in [25] for up to two levels (data 
not shown). 
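
The bookkeeping of the error vector can be illustrated with a minimal Python sketch. This is not the Octave simulator described above. It assumes that errors are stored as two bit lists `x` and `z` (one bit per qubit, following the binary-vector representation of Pauli products) and shows how an error propagates through CNOT and HAD. The function names and the uniform one-qubit error model in `inject_pauli_error` are hypothetical choices made for illustration only.

```python
import random

def apply_cnot(x, z, control, target):
    """Propagate a Pauli error through CNOT(control -> target), ignoring signs."""
    x[target] ^= x[control]   # an X error on the control spreads to the target
    z[control] ^= z[target]   # a Z error on the target spreads to the control

def apply_had(x, z, qubit):
    """HAD exchanges the X and Z components of the error on one qubit."""
    x[qubit], z[qubit] = z[qubit], x[qubit]

def inject_pauli_error(x, z, qubit, prob, rng):
    """Hypothetical error model: with probability prob, apply X, Y or Z uniformly."""
    if rng.random() < prob:
        ex, ez = rng.choice([(1, 0), (1, 1), (0, 1)])
        x[qubit] ^= ex
        z[qubit] ^= ez

# X error on qubit 0 and Z error on qubit 1, followed by a CNOT from 0 to 1.
x, z = [1, 0], [0, 1]
apply_cnot(x, z, control=0, target=1)
print(x, z)  # both errors have spread to the other qubit
```

Because only XOR and swaps are used, each entry of `x` and `z` could equally hold an integer bitmask with one bit per trial, which is one possible way to track many simultaneous preparation attempts against a single check matrix.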

The simulations are used to determine the error behavior of various logical gates. For the 
data shown in Figs. 2, 3 and 4, we used the reference entanglement method [41] for determining 
logical CNOT error probabilities. This involves applying the logical CNOT and error-detecting or 
-correcting teleportations to the first members of two error-free logical Bell pairs and then com- 
paring the logical state to what would have been obtained if the logical CNOT had no error. The 
comparison is implemented by applying error-free CNOTs to disentangle the Bell pairs and mak- 
ing error-free logical X- or Z-measurements with error detection or correction depending on the 
context. The procedure was modified by (1) applying only the CNOT's physical error model as- 
sociated with the transversal implementation and (2) applying the error model and error-detecting 
teleportation twice and determining the incremental error introduced the second time. (1) simpli- 
fies the verification without affecting the error probabilities. (2) is required so as to determine the 
effective error introduced in the middle of a computation, because the error-free Bell pairs have 
no initial error, contrary to what would be expected later. Using the second of two steps suffices 
because of the isolating properties of teleportation, which was verified by taking some data for 
more steps as shown in Fig. 4. For detected error probabilities, the incremental error is determined 
as the fraction of trials in which an uncorrectable error was detected during the teleportations or in 
the verifying measurements. For conditional error probabilities, the incremental error is the frac- 
tion of trials with no detected uncorrectable error for which the logical measurement outcomes are 
incorrect but there was no undetected logical error in the preceding steps. 

F Scalable Quantum Computing via Bootstrapping with Postselection 

The fault-tolerant architecture based on a good quantum error-correcting code $C_e$ using the $C_4/C_6$ 
architecture with postselection for state preparation is described in the main text. We claimed that 
if $3.53\gamma \leq 0.19$ and $\gamma$ is below the threshold for fault-tolerant postselected computing with the 
$C_4/C_6$ architecture, then the logical errors for the $C_e$ architecture can be made to be below $10^{-3}$, 
which is below the threshold for known fault-tolerant architectures. The estimate assumes that 
the decoding error per decoded qubit is $\approx \gamma$, in which case $\approx 3.53\gamma$ is the effective error per 
qubit that determines whether the error-correcting teleportation successfully corrects. With this 
assumption, the claim is proven as follows. Choose $\epsilon$ such that $3.53\gamma \leq 0.19 - \epsilon$. Choose $C_e$ 
such that if a logical qubit is encoded in $C_e$ without error and each physical qubit is independently 
subjected to an error with probability $0.19 - \epsilon$, then the logical state can be recovered with error at 
most $10^{-3}/4$. Such codes exist [29], although their length $n$ grows as $\epsilon$ goes to zero. Algorithms for 
encoding needed states in one to four blocks of $C_e$ require at most $c_1 n^2$ gates for some constant $c_1$ 
[42]. Choose the level of the postselected $C_4/C_6$ architecture so that the logical gate error is well 
below $10^{-3}/(2c_1 n^2)$. Then, before they are decoded, the postselected prepared states have logical 
error at most $10^{-3}/2$, since they required fewer than $c_1 n^2$ logical gates of the $C_4/C_6$ architecture. 
This error persists as a $C_e$-logical error after the $C_e$-error-correcting teleportation that uses this state 
after it is decoded. It adds to the logical error introduced by failure to error-correct in teleportation. 
However, because at most two error-correcting teleportations are involved, the total logical error is 
below $10^{-3}$. Note that if $\gamma$ is given and strictly below the threshold, then the resources required to 
achieve $C_e$-logical EPGs below $10^{-3}$ are determined. Because the fault-tolerant architectures that 
can be used with EPGs of $10^{-3}$ are known to be theoretically efficient, the combined architecture 
starting with $C_e$ is also theoretically efficient. The problem is that as $\gamma$ approaches the upper limit, 
the minimum length of the code $C_e$ grows, and as a result the probability of successfully preparing 
the required states by postselection goes down dramatically, making the combined architecture 
highly impractical. 






G Resource Usage 



The simulations keep track of the number of operations of different types that are applied in the 
course of implementing a quantum network. The resources required depend on whether the net- 
works are implemented with maximum parallelism or sequentially: If they are implemented se- 
quentially, one can take advantage of the ability to abort some computations early, but such imple- 
mentations require quantum memory of sufficiently low error. Here we consider only the case of 
maximum parallelism. At the core of the fault-tolerant architecture is Bell pair preparation. One 
can analyze the resources required to construct a level $l+1$ Bell pair in terms of the number of level 
$l$ Bell pairs consumed. As a first step, consider the case of zero EPG. In this case no error is ever 
detected and all networks succeed on the first try. We count only the number of physical qubit state 
preparations, $p(l, \gamma)$, and the number of physical CNOTs, $c(l, \gamma)$. The number of physical qubit 
measurements is less than the number of qubit state preparations. A level 0 (physical) Bell pair 
requires $p(0, 0) = 2$ qubit state preparations and $c(0, 0) = 1$ CNOT. A level 1 Bell pair requires four 
level 0 Bell pairs and four CNOTs for preparing and verifying the initial state of each of the two 
blocks to be used. Combining the blocks requires another four CNOTs. Thus $p(1, 0) = 8p(0, 0)$ 
and $c(1, 0) = 8c(0, 0) + 12$. For $l \geq 1$, a level $l+1$ Bell pair requires three level $l$ Bell pairs and 
three level $l$ encoded CNOTs for preparing the initial state in each block. Combining the blocks to 
form the level $l+1$ Bell pair requires $3^{l-1} \cdot 4 \times 3$ physical CNOTs. To remove lower level errors, 
each of the six level $l$ subblocks is teleported. Each teleportation uses one level $l$ Bell pair and 
CNOT. Thus $p(l+1, 0) = 12p(l, 0)$ and $c(l+1, 0) = 12c(l, 0) + 3^l \cdot 20$. 
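
The zero-EPG recursion is easy to check numerically. The following Python sketch (the function name is ours, not from the text) iterates the two recurrences stated above and produces the preparation and CNOT counts per level; these can be compared with Table 1.

```python
def zero_epg_resources(max_level):
    """Preparations p[l] and physical CNOTs c[l] for one level-l Bell pair at zero EPG."""
    p = [2]   # p(0, 0): two qubit state preparations
    c = [1]   # c(0, 0): one CNOT
    if max_level >= 1:
        p.append(8 * p[0])        # p(1, 0) = 8 p(0, 0)
        c.append(8 * c[0] + 12)   # c(1, 0) = 8 c(0, 0) + 12
    for l in range(1, max_level):
        p.append(12 * p[l])                  # p(l+1, 0) = 12 p(l, 0)
        c.append(12 * c[l] + 20 * 3 ** l)    # c(l+1, 0) = 12 c(l, 0) + 3^l * 20
    return p, c

preparations, cnots = zero_epg_resources(6)
for level, (p_l, c_l) in enumerate(zip(preparations, cnots)):
    print(level, p_l, c_l)
```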

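TABLE. 1: Table of resources used for logical Bell pair preparation at EPG $\gamma = 0$. 

| Level | Preparations | CNOTs |
|---|---|---|
| 0 | 2 | 1 |
| 1 | 16 | 20 |
| 2 | 192 | 300 |
| 3 | 2304 | 3780 |
| 4 | $2.765 \times 10^{4}$ | $4.590 \times 10^{4}$ |
| 5 | $3.318 \times 10^{5}$ | $5.524 \times 10^{5}$ |
| 6 | $3.981 \times 10^{6}$ | $6.634 \times 10^{6}$ |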

With maximum parallelism, the average resource requirements increase by factors inversely 
related to the probability of success at various points in the preparation process. Preparing an en- 
coded Bell state involves two sequential steps that may fail. The first verifies the initial states of 
each block before they are combined with CNOTs. Let the probability of successful verification 
of a block at level $l$ be given by $v(l, \gamma)$. The second involves teleportation of each subblock after 
the two blocks are combined. Let the overall probability of success of the teleportations be $t(l, \gamma)$. 
Note that both of these probabilities of success are with respect to the combination of error correction 
and detection used in state preparation, which differs from the full error correction used in logical 
computation. The above resource formulas are modified as follows: 

$$p(0, \gamma) = 2, \qquad c(0, \gamma) = 1,$$
$$p(1, \gamma) = 8p(0, \gamma)/v(1, \gamma), \qquad c(1, \gamma) = 8c(0, \gamma)/v(1, \gamma) + 12,$$
$$p(l+1, \gamma) = \bigl(6p(l, \gamma)/v(l+1, \gamma) + 6p(l, \gamma)\bigr)/t(l+1, \gamma),$$
$$c(l+1, \gamma) = \bigl((6c(l, \gamma) + 3^{l} \cdot 12)/v(l+1, \gamma) + 6c(l, \gamma) + 3^{l} \cdot 8\bigr)/t(l+1, \gamma).$$

These formulas were obtained under the assumption that the verification of the two blocks proceeds 
independently with many simultaneous attempts, where the successful ones are then combined. This 
requires waiting for measurement outcomes and any associated memory error must be accounted 
for in $\gamma$. The subblock teleportations are not independent because of the immediately preceding 
transversal CNOT, which introduces correlated errors. Tables 2, 3 and 4 show the success probabilities 
up to level 5 for $\gamma = 0.01, 0.001, 0.0001$ together with the resources estimated according to these 
recursive formulas and the resources determined by the simulation after averaging over the number 
of attempts made. The simulation is expected to show higher resource requirements because it 
involves some loss when combining unequal numbers of independently prepared blocks, as would 
be expected to occur in a real implementation. This was not taken into account in deriving the 
formulas. 
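
As an illustration, the recursive formulas can be evaluated with the success probabilities reported for $\gamma = 0.01$ in Table 2. The Python sketch below is our own arrangement of the formulas (function and variable names are assumptions). Because the tabulated probabilities are rounded to three digits, the output should agree with the $p(l, 0.01)$ and $c(l, 0.01)$ columns only to within rounding.

```python
def bell_pair_resources(v, t):
    """Evaluate p(l, gamma) and c(l, gamma) from success probabilities v[l] and t[l].

    v[0], t[0] and t[1] are unused placeholders (no verification or teleportation there).
    """
    p = [2.0]
    c = [1.0]
    p.append(8 * p[0] / v[1])
    c.append(8 * c[0] / v[1] + 12)
    for l in range(1, len(v) - 1):
        p.append((6 * p[l] / v[l + 1] + 6 * p[l]) / t[l + 1])
        c.append(((6 * c[l] + 12 * 3 ** l) / v[l + 1] + 6 * c[l] + 8 * 3 ** l) / t[l + 1])
    return p, c

# Success probabilities for gamma = 0.01, copied from Table 2.
v_001 = [None, 0.940, 0.722, 0.602, 0.885, 0.900]
t_001 = [None, None, 0.247, 0.100, 0.205, 0.500]

p_est, c_est = bell_pair_resources(v_001, t_001)
for level in range(len(p_est)):
    print(level, f"{p_est[level]:.4g}", f"{c_est[level]:.4g}")
```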

TABLE. 2: Table of success probabilities and resources used for Bell state preparation at EPG 
$\gamma = 0.01$. 
The values $v(l, 0.01)$, $t(l, 0.01)$ and the numbers in the "preparations" and "CNOTs" columns are 
obtained by simulation using the number of successful Bell pair preparations shown in the "# Bell 
pairs" column. Because only two successful preparations were used at level 5, the level 5 data have 
significant noise. 

| Level | $v(l, 0.01)$ | $t(l, 0.01)$ | $p(l, 0.01)$ | Preparations | $c(l, 0.01)$ | CNOTs | # Bell pairs |
|---|---|---|---|---|---|---|---|
| 0 | NA | NA | 2 | 2 | 1 | 1 | NA |
| 1 | 0.940 | NA | 17.01 | 17.01 | 20.51 | 21.01 | 10345 |
| 2 | 0.722 | 0.247 | 984.6 | 1022.2 | 1485.5 | 1542.7 | 2279 |
| 3 | 0.602 | 0.100 | $1.58 \times 10^{5}$ | $1.60 \times 10^{5}$ | $2.40 \times 10^{5}$ | $2.45 \times 10^{5}$ | 409 |
| 4 | 0.885 | 0.205 | $9.84 \times 10^{6}$ | $9.63 \times 10^{6}$ | $1.50 \times 10^{7}$ | $1.48 \times 10^{7}$ | 70 |
| 5 | 0.900 | 0.500 | $2.49 \times 10^{8}$ | $3.08 \times 10^{8}$ | $3.80 \times 10^{8}$ | $4.72 \times 10^{8}$ | 2 |

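TABLE. 3: Table of success probabilities and resources used for Bell state preparation at EPG 
$\gamma = 0.001$. 

| Level | $v(l, 0.001)$ | $t(l, 0.001)$ | $p(l, 0.001)$ | Preparations | $c(l, 0.001)$ | CNOTs | # Bell pairs |
|---|---|---|---|---|---|---|---|
| 0 | NA | NA | 2 | 2 | 1 | 1 | NA |
| 1 | 0.994 | NA | 16.10 | 16.11 | 20.05 | 20.11 | 10927 |
| 2 | 0.970 | 0.870 | 225.6 | 226.6 | 351.2 | 352.8 | 2014 |
| 3 | 0.957 | 0.815 | 3395.6 | 3434.9 | 5513.4 | 5576.1 | 401 |
| 4 | 1.000 | 0.970 | $4.20 \times 10^{4}$ | $4.67 \times 10^{4}$ | $6.88 \times 10^{4}$ | $7.61 \times 10^{4}$ | 64 |
| 5 | 1.000 | 1.000 | $5.04 \times 10^{5}$ | $5.61 \times 10^{5}$ | $8.27 \times 10^{5}$ | $9.17 \times 10^{5}$ | 2 |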

TABLE. 4: Table of success probabilities and resources used for Bell state preparation at EPG 
$\gamma = 0.0001$. 

| Level | $v(l, 0.0001)$ | $t(l, 0.0001)$ | $p(l, 0.0001)$ | Preparations | $c(l, 0.0001)$ | CNOTs | # Bell pairs |
|---|---|---|---|---|---|---|---|
| 0 | NA | NA | 2 | 2 | 1 | 1 | NA |
| 1 | 0.999 | NA | 16.01 | 16.02 | 20.01 | 20.02 | 10987 |
| 2 | 0.995 | 0.984 | 195.7 | 204.9 | 305.6 | 317.2 | 2155 |
| 3 | 0.994 | 0.982 | 2398.6 | 2556.0 | 3929.9 | 4158.1 | 429 |
| 4 | 1.000 | 1.000 | $2.88 \times 10^{4}$ | $3.13 \times 10^{4}$ | $4.77 \times 10^{4}$ | $5.16 \times 10^{4}$ | 66 |
| 5 | 1.000 | 1.000 | $3.45 \times 10^{5}$ | $3.92 \times 10^{5}$ | $5.74 \times 10^{5}$ | $6.50 \times 10^{5}$ | 2 |

Resources for implementing logical gates transversally are dominated by those required for 
logical Bell state preparation. For example, the logical CNOT includes error-correcting teleporta- 
tion and therefore requires two logical Bell states and three transversal CNOTs. The number of 
physical CNOTs in a transversal CNOT grows by a factor of 3 for each level after the first, whereas 
the number of physical CNOTs required for logical Bell state preparation grows by a factor greater 
than 12. This justifies focusing attention on the resources required for logical Bell state 
preparation. The biggest resource overhead is incurred when implementing non-Clifford gates such as 
$|\pi/8\rangle$-preparation (see below) or Toffoli gates. Note that two $|\pi/8\rangle$ states are needed to implement 
a Toffoli gate up to a reversible phase in the logical basis, which is all that is required for most 
uses of Toffoli gates. We have not attempted to optimize $|\pi/8\rangle$-preparation. Furthermore, it is 
possible that gates such as the Toffoli gate can be implemented more efficiently using other states, 
for example, using Steane's adaptation [43] of Shor's method [24]. 

For completeness and to obtain an upper bound on the requirements for a minimal non-trivial 
quantum algorithm at $\gamma = 0.01$, we outline one method for preparing good logical $|\pi/8\rangle$ states, 
discuss why the error when using these states is expected to be similar to that of one logical 
CNOT and estimate the average number of logical CNOTs required. A straightforward method 
for preparing a noisy logical $|\pi/8\rangle$-state is to prepare a logical Bell state, decode the first block 
and make a measurement in the basis $|\pi/8\rangle$, $|5\pi/8\rangle = -\sin(\pi/8)|0\rangle + \cos(\pi/8)|1\rangle$ of each of 
the two decoded, now physical qubits. Note that $|5\pi/8\rangle$ differs from $|\pi/8\rangle$ by a $Y$ operator, so 
any measurement outcome is acceptable and can be accounted for by a change in Pauli frame if 
necessary. The simulations indicate that if the measurement has the same error probability as a 
Z- or X-measurement, then the error $\epsilon_{\pi/8}$ in the logical prepared state is near the EPG parameter 
$\gamma$. To reduce the noise in the logical $|\pi/8\rangle$ states, one can purify them. The simplest purification 
method known so far involves using 15 prepared $|\pi/8\rangle$ states. One is encoded into the [[7, 1, 3]] 
code [44] (a code that encodes 1 qubit in 7 qubits with minimum distance 3, which implies that it can 
correct any $(3 - 1)/2 = 1$ qubit error or detect any $(3 - 1) = 2$ qubit errors). The other $2 \times 7 = 14$ 
$|\pi/8\rangle$ states are used to implement a conditional logical HAD from an ancilla to realize an 
encoded HAD measurement. Note that $|\pi/8\rangle$ is the +1 eigenstate of HAD. In the last step, the 
[[7, 1, 3]] code is decoded. If the measurement outcomes are as would be expected if no error had 
occurred, the state is accepted and has much reduced conditional error. The method is equivalent to 
Bravyi and Kitaev's scheme [28] (Reichardt, private communication) and can be analyzed using their 
formulas. With no error in the CNOTs and HADs used to implement the procedure, the probability 
of error in successfully purified gates is $\epsilon'_{\pi/8} = 35\epsilon_{\pi/8}^{3} + O(\epsilon_{\pi/8}^{4})$. The probability of success is 
$p_{\pi/8} = 1 - 15\epsilon_{\pi/8} + O(\epsilon_{\pi/8}^{2})$. Using the exact formulas, for $\epsilon_{\pi/8} = 0.01, 0.001, 0.0001$ we obtain 
$\epsilon'_{\pi/8} = 3.6 \times 10^{-5}, 3.5 \times 10^{-8}, 3.5 \times 10^{-11}$ and $p_{\pi/8} = 0.860, 0.985, 0.999$. The purification can 
be iterated, but for the example considered below, the dominant errors are from the logical CNOT and 
HAD gates, so we use only one purification stage. 
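
The quoted purification numbers can be reproduced with a short Python sketch. The exact expressions are not given in this text; the code assumes the closed-form output error and acceptance probability of the 15-to-1 protocol as usually attributed to Bravyi and Kitaev, written in terms of $u = 1 - 2\epsilon$. Treat the two formulas in `purify` as that assumption, and the function name as ours.

```python
def purify(eps):
    """One round of 15-to-1 purification with ideal Clifford gates.

    Assumed closed forms (not stated in the text), with u = 1 - 2*eps:
      success probability = (1 + 15 u^8) / 16
      output error        = (1 - 15 u^7 + 15 u^8 - u^15) / (2 (1 + 15 u^8))
    """
    u = 1.0 - 2.0 * eps
    p_success = (1.0 + 15.0 * u ** 8) / 16.0
    eps_out = (1.0 - 15.0 * u ** 7 + 15.0 * u ** 8 - u ** 15) / (2.0 * (1.0 + 15.0 * u ** 8))
    return eps_out, p_success

for eps in (0.01, 0.001, 0.0001):
    eps_out, p_success = purify(eps)
    print(f"eps={eps:g}  eps_out={eps_out:.2e}  p_success={p_success:.3f}")
```

To lowest order the output error behaves as $35\epsilon^{3}$ and the acceptance probability as $1 - 15\epsilon$, which is what the leading terms quoted above describe.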

Consider the effect of logical gate error on the error in the purified $|\pi/8\rangle$. We conjecture that 
by using state injection with Steane's fault-tolerant methods for preparing states, the additional 
error on the purified $|\pi/8\rangle$ state is dominated by a decoding error of the order of the logical CNOT 
error. Specifically, one can encode one noisy logical $|\pi/8\rangle$ by teleportation into the [[7, 1, 3]] code 
using a Bell state correlating a logical qubit and a [[7, 1, 3]]-encoded qubit. (Strictly speaking, our 
architecture requires the use of logical qubit pairs associated with blocks of the $C_4/C_6$ codes, but 
we treat each qubit in a pair identically.) This Bell state has minimum distance 4, so that any 
combination of Pauli errors on up to three qubits results in an orthogonal state. It can therefore 
be well verified using Steane's methods. The error in the state teleported into the [[7, 1, 3]] code 
is due to the initially prepared $|\pi/8\rangle$ state, initial error in the logical qubit of the Bell state used, 
and the CNOT and measurements needed for the teleportation Bell measurement. Because we are 
operating with logical qubits of the $C_4/C_6$ architecture, all but the first of these errors are 
comparatively small, assuming that the $C_4/C_6$ encoding level is chosen so as to significantly decrease 
CNOT errors. The errors have two effects. One is to modify the encoded state, which can be 
subsumed by considering this as additional error in the initial $|\pi/8\rangle$ state. The other is to perturb 
the syndrome of the encoded state. If two or fewer errors occurred, this can be detected in the 
decoding stage. The encoded state is verified using the controlled-HADs implemented with the 
other 14 noisy logical $|\pi/8\rangle$ states. Each of these controlled-HADs involves at most five CNOTs [27]. 
The error is dominated by that in the $|\pi/8\rangle$ states used. Additional error due to the logical CNOT 
either has a smaller effect, to be detected in decoding, or results in the wrong outcome in the 
encoded HAD measurement. The latter event could cause unintentional acceptance of the final state, 
but only if additional error occurred elsewhere. At the end of the procedure, the [[7, 1, 3]]-encoded 
qubit is decoded and the syndrome verified. One can decode directly or by reverse teleportation 
through the same type of Bell state used for the initial teleportation, verifying the syndrome in the 
teleportation process. The latter method may be more robust. In all cases, the effects of additional 
errors are either suppressed by the fault-tolerant methods used to encode and decode the [[7, 1, 3]] 
code, or can be subsumed as a relatively small amount of additional error in the initial $|\pi/8\rangle$ states 
due to at most five logical CNOTs. Based on experience with $C_4/C_6$ codes, it is likely that the 
additional error from encoding and decoding is of the order of that of a logical CNOT, whereas 
the effective additional $|\pi/8\rangle$ error should be sufficiently small (because of significant decrease in 
CNOT errors at the level chosen) to have little effect on the error in the purified $|\pi/8\rangle$. 

We estimate the number of logical CNOTs needed for the $|\pi/8\rangle$ purification process. The Bell 
state needed for injection into the [[7, 1, 3]] code can be prepared from a logical Bell state by 
encoding one of the two blocks into the [[7, 1, 3]] code. 11 CNOTs suffice for encoding. The resulting 
state can be verified using Steane's methods. There are eight syndromes each of weight 4 to check; 
each requires an ancilla preparation with five CNOTs and four CNOTs for the syndrome check. If 
memory is an issue, it may be necessary to add error-correcting teleportations not associated with 
a gate. We do not consider this here but note that this may add another four logical Bell states 
per syndrome check to the resources required. If the robust decoding scheme is used, two of the 
injection Bell states are required overall. The verification process using controlled-HADs requires 
about $5 \times 7$ CNOTs. This gives a total of $2 \times (11 + 8 \times 9) + 5 \times 7 = 201$ CNOTs, but does not take 
into account the probability of failure in the various checks. We can estimate this probability as 
$1 - (1 - p)^{201}$, where $p$ is the probability of detected error in a logical CNOT. For the relevant 
parameters, $p$ is below 0.003. Taking the average number of trials required due to logical gate 
failure as $1/(1 - p)^{201}$, we can upper bound the average number of CNOTs required as 370. 
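
The count can be retraced in a few lines of Python. The breakdown below follows the numbers given in the paragraph and assumes that both injection Bell states (robust decoding) are included; the variable names are ours.

```python
encoding_cnots = 11                  # encode one block into the [[7, 1, 3]] code
syndrome_cnots = 8 * (5 + 4)         # 8 syndromes: 5 CNOTs ancilla prep + 4 CNOTs check
per_injection_state = encoding_cnots + syndrome_cnots
verification_cnots = 5 * 7           # controlled-HADs

total_cnots = 2 * per_injection_state + verification_cnots

p_detect = 0.003                     # upper value for detected error per logical CNOT
p_no_failure = (1 - p_detect) ** total_cnots
average_cnots = total_cnots / p_no_failure   # repeat until no failure is detected

print(total_cnots, round(p_no_failure, 3), round(average_cnots, 1))
```

With $p = 0.003$ the average stays just below the bound of 370 quoted in the text.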

An obvious optimization of the $|\pi/8\rangle$-purification method in the context of the fault-tolerant 
$C_4/C_6$ architecture is to concatenate with the [[7, 1, 3]] code as a last level, lifting all logical states 
accordingly, but injecting $|\pi/8\rangle$ states to the last $C_4/C_6$ level as before for $|\pi/8\rangle$ purification 
purposes. This avoids having to decode the purified $|\pi/8\rangle$ states while achieving significantly lower 
error probabilities. Within the $C_4/C_6$ scheme, if more than one purification stage is required, it may 
be worthwhile injecting and purifying states at intermediate levels before injecting and purifying 
at the top. 

As an example, consider $\gamma = 0.01$, with the aim of implementing a non-trivial quantum computa- 
tion. The smallest non-trivial quantum computation must be one involving more qubits than can 
be directly simulated on existing classical computers. 100 qubits is a safe number for this property. 
Such a quantum computation should also apply sufficiently many gates for a classical simulation 
with current computers not to be able to predict the output of the quantum computation by taking 
advantage of restrictions on the reachable states. If the number of gates applied involves suffi- 
ciently many parallel steps of non-Clifford gates involving all qubits, this is expected to be the 
case. Short of having an explicit example of a computation whose output is unknown and not be- 
lieved to be accessible to classical computers, we assume that 10 steps involving parallel CNOTs, 
HADs and $|\pi/8\rangle$-preparations suffice.¹ We therefore take 1000 (100 qubits times 10 steps) as a minimal number of gates in a 
non-trivial quantum algorithm. Note that with EPGs of 0.01 it is not possible to combine this many 
physical gates and still expect that a computation's output can be discerned. If more than 68 phys- 
ical gates at this EPG are applied, the probability that the output is correct cannot be guaranteed to 
be strictly greater than 0.5. Although the output of a computation with so few gates may already 
be difficult to simulate with current classical computers, it is conceivably possible to do so. 
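
The figure of 68 gates can be recovered under a simple reading of the argument: assume each physical gate fails independently with probability equal to the EPG and that a single failure may spoil the output, so that only $(1 - \gamma)^{n}$ can be guaranteed as the probability of a correct output after $n$ gates. This reading, and the function name, are our assumptions. A minimal Python sketch:

```python
def max_gates_above_half(epg):
    """Largest n such that (1 - epg)**n is still strictly greater than 0.5."""
    n = 0
    while (1 - epg) ** (n + 1) > 0.5:
        n += 1
    return n

print(max_gates_above_half(0.01))
```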

¹ Finding a computation with as few as 100 qubits and fewer than $10^{4}$ gates with a definite and convincing answer 
of interest independent of quantum information theory would be very helpful and could be a boon for quantum 
information processing. For comparison, all fully worked out computations of this sort seem to require that the 
number of gates greatly exceeds $10^{9}$. 

Consider level 4 of our scheme at $\gamma = 0.01$. The detected error probability of a logical CNOT 
is $p_d \approx 2.4 \times 10^{-3}$. The conditional probability of a logical error is much lower and estimated 
as $p_c = 2.3 \times 10^{-5}$ (see below). The logical $|\pi/8\rangle$-purification method ensures that similar error 
probabilities apply to uses of these states in the algorithm. The probability that there is a detected 
failure in 1000 gates is $1 - (1 - p_d)^{1000} \approx 0.91$. Thus, the algorithm needs to be applied 11.1 
times on average before a successful answer is obtained. Once an answer is obtained, its error 
probability is $1 - (1 - p_c)^{1000} \approx 0.02$. The average number of physical CNOTs required for 
obtaining an answer can now be estimated as 

$$1.23 \times 10^{14} = \underbrace{1000}_{\text{logical gates in computation}} \times \underbrace{370}_{\text{logical CNOTs per purified } |\pi/8\rangle\text{-prep}} \times \underbrace{1/0.09}_{\text{prob.}^{-1}\text{ of overall success}} \times \underbrace{2 \times 1.5 \times 10^{7}}_{\text{physical CNOTs per logical CNOT}} \qquad (1)$$

Although this number of physical CNOTs vastly exceeds current capabilities, it may be compared 
to typical resources available today for classical computation. For example, today's central 
processing units in desktop computers have more than $10^{8}$ transistors and operate at rates above $10^{9}$ 
bit operations per second [30]. 
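
The arithmetic behind Eq. (1) can be retraced with a short Python sketch. All inputs are the values quoted in the text for level 4 at an EPG of 0.01; the variable names are ours, and the rounded success probability 0.09 is used in the final product, as in Eq. (1).

```python
n_gates = 1000               # logical gates in the computation
p_detect = 2.4e-3            # detected error probability per logical CNOT
p_cond = 2.3e-5              # conditional logical error probability
cnots_per_gate = 370         # logical CNOTs per purified pi/8-state preparation
physical_per_logical = 2 * 1.5e7   # two logical Bell states per logical CNOT

p_detected_failure = 1 - (1 - p_detect) ** n_gates
p_success = 1 - p_detected_failure
answer_error = 1 - (1 - p_cond) ** n_gates

total_physical_cnots = n_gates * cnots_per_gate * (1 / 0.09) * physical_per_logical

print(f"detected failure: {p_detected_failure:.2f}")
print(f"runs needed on average: {1 / p_success:.1f}")
print(f"error of accepted answer: {answer_error:.3f}")
print(f"physical CNOTs: {total_physical_cnots:.3g}")
```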

Resource requirements for implementing a given computation decrease significantly with $\gamma$. 
Simulation is too inefficient for resolving the dependence of resource requirements on $\gamma$, 
particularly when error probabilities are extremely small. We therefore obtain and verify simple models 
for resources and errors as a function of $\gamma$ and level of concatenation. Ideally, we would like to 
obtain analytic expressions; however, this is difficult to do, particularly since our schemes are not 
strictly concatenated, and the combination of error-detection and correction behaves differently 
depending on the level. Nevertheless, it is possible to derive functional forms for Bell state 
preparation resources and logical CNOT error behavior that are asymptotically valid as $\gamma$ goes to 0. 

We model the number of physical CNOTs required for preparing logical Bell states at level $l$ as 
$r_{\mathrm{bell}}(l, \gamma) = P(l)/(1 - \gamma)^{k(l)}$. This is a naive model based on assuming that the resources are 
determined by applying a network with $P(l)$ physical CNOT gates, $k(l)$ of which fail independently 
with probability $\gamma$ each, and the network is repeatedly applied until no failure is detected. Perhaps 
surprisingly, this model matches the simulations well in the range shown in Fig. 15. 
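
A minimal Python sketch of the naive model follows. The fitted values of $P(l)$ and $k(l)$ are not listed in the text, so the numbers in the usage line are purely hypothetical placeholders.

```python
def r_bell(P, k, gamma):
    """Naive model: a network of P CNOTs is repeated until none of k gates fails.

    The success probability per attempt is (1 - gamma)**k, so the mean number of
    attempts is 1 / (1 - gamma)**k and the mean CNOT count is P times that.
    """
    return P / (1.0 - gamma) ** k

# Hypothetical parameters, for illustration only.
for gamma in (0.0001, 0.001, 0.01):
    print(gamma, round(r_bell(P=300, k=150, gamma=gamma), 1))
```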
FIGURE. 15: Graphs of resources for logical Bell pair preparation. The points are obtained by 
simulation and counting the number of physical CNOTs used in preparing a number of logical Bell 
pairs at different values of $\gamma$. The error bars are not statistically rigorous. They are standard 
deviations computed from the number of prepared Bell pairs and assuming the naive model described 
in the text. The curves are least-squares fits to the data with functions of the form $P(l)/(1 - \gamma)^{k(l)}$. 

To understand the error behavior of the $C_4/C_6$ architecture, suppose more generally that we 
have a fault-tolerant scheme $A$ for implementing an encoded gate, which results in a detected, 
uncorrectable error with probability $p_d$, or an undetected logical error with probability $p_c$, 
conditional on not having detected an error. Suppose that this is concatenated with a one-error detecting 
(minimum distance 2) code $C$ and used in a scheme similar to the ones used here. $C$ can correct 
any error at a known location. If the implementation of $C$-encoded gates is fault tolerant and 
includes error-correcting teleportation or another method for determining the $C$-syndrome, any one 
error detected by $A$ can be corrected with no resulting encoded error. The event that an error is 
detected but not correctable during implementation of a $C$-encoded gate therefore requires at least 
one undetected error or at least two detected errors. The conditional event that an undetected error 
occurs requires that the $A$ gates used have one detected and one undetected error or two or more 
undetected errors. To lowest order, the detected and conditional error probabilities for $C$-encoded 
gates are therefore of the form $p'_d = D_c p_c + D_d p_d^{2}$ and $p'_c = C_c p_c^{2} + C_d p_d p_c$. In our case, $p_d$ and 
$p_c$ depend on one parameter $\gamma$. After level 1, the order in $\gamma$ of $p_c$ is always between that of $p_d$ and $p_d^{2}$, 
so the expressions can be simplified to $p'_d = D p_c$, $p'_c = C p_d p_c$, to lowest order in $\gamma$. Let $p_d(l)$ 
and $p_c(l)$ be the detected and logical error probabilities at level $l$ in the $C_4/C_6$ architecture. 
Using the above, we can write $p_d(l+1) = D(l) p_c(l)$ and $p_c(l+1) = C(l) p_d(l) p_c(l)$. At level 1, 
$p_d(1) = d(1)\gamma$ and $p_c(1) = c(1)\gamma^{2}$, to lowest order in $\gamma$. Expanding the recursion at higher 
levels, we obtain $p_d(2) = d(2)\gamma^{2}$, $p_c(2) = c(2)\gamma^{3}$, $p_d(3) = d(3)\gamma^{3}$, $p_c(3) = c(3)\gamma^{5}$. The Fibonacci 
sequence $f(0) = 0$, $f(1) = 1$, $f(n+1) = f(n) + f(n-1)$ emerges as the relevant exponent so 
that $p_d(l) = d(l)\gamma^{f(l+1)}$, $p_c(l) = c(l)\gamma^{f(l+2)}$. As is typical of concatenation schemes, the exponent 
grows exponentially. 
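
The emergence of the Fibonacci exponents can be checked by tracking only the lowest-order powers of the EPG parameter through the simplified recursion. The following Python sketch (function names are ours) does this and compares the result with $f(l+1)$ and $f(l+2)$.

```python
def error_exponents(max_level):
    """Lowest-order exponents of gamma in p_d(l) and p_c(l) for l = 1..max_level."""
    exp_d, exp_c = [1], [2]            # level 1: p_d ~ gamma, p_c ~ gamma^2
    for _ in range(1, max_level):
        exp_d.append(exp_c[-1])                # p_d(l+1) = D(l) p_c(l)
        exp_c.append(exp_d[-2] + exp_c[-1])    # p_c(l+1) = C(l) p_d(l) p_c(l)
    return exp_d, exp_c

def fibonacci(n):
    """f(0) = 0, f(1) = 1, f(n+1) = f(n) + f(n-1)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

exp_d, exp_c = error_exponents(6)
for level in range(1, 7):
    print(level, exp_d[level - 1], fibonacci(level + 1), exp_c[level - 1], fibonacci(level + 2))
```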

In view of the previous paragraph, we examine the data shown in Fig. 3 to determine $C(l)$, $D(l)$ 
for $l = 1, 2$ and $c(l)$, $d(l)$ for $l = 1, 2, 3$. The results are shown in Table 5. We computed the 
values of $c(l)$ and $d(l)$ by fitting the model curves to the error probabilities obtained by simulation. 
The points at $\gamma = 0.01$ were omitted for levels 1, 2, and 3 to reduce the chance of introducing 
optimistic biases by the curves' leveling off at higher $\gamma$, although this effect has not been observed. 
We obtained the fits by starting with a least-squares fit of the log-log plots and then using a fastest- 
descent method to optimize the likelihood. We computed standard deviations by resampling the 
data according to the fitted curve and repeating the fitting process. The fitted curves are shown 
with the data in Fig. 16. $C(l)$ and $D(l)$ were computed from $c(l)$, $c(l+1)$, $d(l)$ and $d(l+1)$ by 
solving the equations. Their uncertainty intervals are found by linear error analysis. 

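TABLE. 5: Table of $d(l)$, $c(l)$, $D(l)$ and $C(l)$ with uncertainty intervals based on standard 
deviations. 

| Level | $d(l)$ | $c(l)$ | $D(l)$ | $C(l)$ |
|---|---|---|---|---|
| 1 | $37.0 \pm 0.1$ | $35.2 \pm 1.5$ | $29.94 \pm 1.13$ | $3.43 \pm 0.01$ |
| 2 | $(1.06 \pm 0.01) \times 10^{3}$ | $(4.47 \pm 0.18) \times 10^{3}$ | $4.87 \pm 0.14$ | $1.69 \pm 0.14$ |
| 3 | $(2.18 \pm 0.02) \times 10^{4}$ | $(7.95 \pm 1.01) \times 10^{6}$ | $3.01 \pm 0.70$ | NA |
| 4 | $(2.39 \pm 0.86) \times 10^{7}$ | NA | NA | NA |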
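
Solving the recursion for the level-to-level constants gives $D(l) = d(l+1)/c(l)$ and $C(l) = c(l+1)/(d(l)\,c(l))$. The Python sketch below applies these relations to the central values of $d(l)$ and $c(l)$ in Table 5. Because the tabulated values are rounded, the results agree with the tabulated $D(l)$ and $C(l)$ only to within about a percent, and no uncertainty propagation is attempted.

```python
# Central values of the fitted constants from Table 5.
d = {1: 37.0, 2: 1.06e3, 3: 2.18e4, 4: 2.39e7}
c = {1: 35.2, 2: 4.47e3, 3: 7.95e6}

def D(level):
    """From p_d(l+1) = D(l) p_c(l): D(l) = d(l+1) / c(l)."""
    return d[level + 1] / c[level]

def C(level):
    """From p_c(l+1) = C(l) p_d(l) p_c(l): C(l) = c(l+1) / (d(l) c(l))."""
    return c[level + 1] / (d[level] * c[level])

for level in (1, 2, 3):
    print("D(%d) = %.2f" % (level, D(level)))
for level in (1, 2):
    print("C(%d) = %.2f" % (level, C(level)))
```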
FIGURE. 16: Fits to the error data for the logical CNOT. The model assumed is $p_d(l) = d(l)\gamma^{f(l+1)}$, 
$p_c(l) = c(l)\gamma^{f(l+2)}$, where $f$ is the Fibonacci sequence. The constants $d(l)$ and $c(l)$ 
are obtained by a maximum-likelihood method from the data points in the range of the solid lines. 
The gray dashed lines are extrapolations. 
The constants $D(l)$, $C(l)$ are significantly reduced for going from level 2 to level 3 compared 
to going from level 1 to 2. Level 2 is the first stage of using $C_6$ and the first where error correction 
can be used. One may conjecture that the level 2 to level 3 behavior persists or improves at higher 
levels, as is the case for $D(3)$ compared to $D(2)$. For the purposes of modeling errors we use this 
conjecture to recursively obtain $d(l+1)$ and $c(l+1)$ with $D(2)$ and $C(2)$ in place of $D(l)$ and 
$C(l)$ for $l > 2$. It is an interesting exercise to use the recursion implied by the $D(l)$ and $C(l)$ to 
obtain a threshold. The threshold thus obtained is conjectural, because the approximations made 
are not strictly valid, particularly at high $\gamma$, and because of the extrapolation of $D(l)$ and $C(l)$. By 
implementing the recursion numerically, we obtained a threshold of $\approx 0.028$ for this architecture, 
which does not seem unreasonable in view of the data shown in Fig. 3. Of course, the resource 
overheads diverge as any such threshold is approached from below. 

We return to the question of resource requirements for implementing gates at $\gamma < 0.01$. As $\gamma$
decreases, the physical resources required per logical CNOT are reduced in two ways. First, the
state preparation success probabilities at a given level of concatenation increase; see Tables 2, 3
and 4 and Fig. 15. This increase is particularly notable near the upper limit for $\gamma$. Second, fewer
levels of concatenation suffice for achieving sufficiently low logical errors. Consider implementing
a computation C with the product of the number of logical gates and average number of qubits per
gate given by KQ. For computations that are not maximally parallel, this quantity should include
memory delays in the gate count. To simplify the resource estimates, logical errors and physical
gate counts are given in terms of "effective" error and physical gate counts per (logical) qubit and
gate. For example, consider the logical CNOT in the $C_4/C_6$ architecture. It acts on two logical qubit
pairs, so its effective error per qubit is 1/4 of its total error. Similarly, its effective physical gate
count per qubit is 1/4 of the total gate count. With this simplification, we can estimate the total
error and number of physical gates for implementing the computation C by multiplying KQ by
the appropriate effective quantity and a nontransversal-gate state preparation overhead. In making
these estimates, we assume that (1) each of the logical gates needed by C can be implemented
with effective error similar to that of the logical CNOT, (2) the implementation can take advantage
of both logical qubits in the logical qubit pairs and (3) overhead for addressing individual logical
qubits in the pairs is accounted for in the nontransversal-gate state preparation overhead. The
assumptions require that the nontransversal-gate state preparations have the property that logical
gates used in the preparations do not contribute additional error, as is the case for the $|\pi/8\rangle$ state
preparation described above. The reason for not including the nontransversal-gate state preparation
overhead in the effective quantities per qubit and gate is that this overhead can be optimized
independent of the architecture and depends on the choice of elementary nontransversal gates. It
is expected to add one to two orders of magnitude to the total implementation resources.

We estimate the optimal effective number $\mathrm{pcnot}(KQ, \gamma)$ of physical CNOTs per qubit and gate
as a function of the size KQ of C and the EPG parameter $\gamma$. As noted above, other physical
resources such as state preparation and measurement are comparable. We optimize $\mathrm{pcnot}(KQ, \gamma)$
by choosing the level $l$ of the $C_4/C_6$ architecture and use it to repeatedly implement C until no
uncorrectable error is detected in the logical gates. At this point the output of C must be correct
with probability at least 2/3. The value of 2/3 is chosen to be strictly between 1/2 and 1 but
otherwise not crucial. At the minimizing level, $\mathrm{pcnot}(KQ, \gamma)$ is computed as the product of the
average number of times C must be implemented until no error is detected and 1/2 of the number of
physical CNOTs, $\mathrm{rbell}(l, \gamma)$, needed to prepare a logical Bell state (neglecting the relatively small
additional number of physical CNOTs needed for transversal gates and for using the Bell state in an
error-correcting teleportation). The factor of 1/2 accounts for having two qubits in each block of
the $C_4/C_6$ concatenated codes. The probability of success of a single instance of C can be estimated
as $(1 - p_d(l,\gamma)/4)^{KQ}$, which is approximately correct for our accounting using effective errors per
qubit and gate, provided that $p_d(l,\gamma)$ is small. On average, C must be tried $1/(1 - p_d(l,\gamma)/4)^{KQ}$
times to successfully obtain the output. The conditional probability of a successful output's being
correct is $(1 - p_c(l,\gamma)/4)^{KQ}$. Thus, given KQ, the optimal $\mathrm{pcnot}(KQ, \gamma)$ is obtained as the
minimum over $l$ of $\frac{1}{2}\mathrm{rbell}(l,\gamma)/(1 - p_d(l,\gamma)/4)^{KQ}$ subject to $(1 - p_c(l,\gamma)/4)^{KQ} \geq 2/3$. Curves
for $\mathrm{pcnot}(KQ, \gamma)$ for various KQ as a function of $\gamma$ are plotted in Fig. 5.
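
The minimization over levels can be sketched as follows, using the fitted model $p_d(l,\gamma) = d(l)\gamma^{f(l+1)}$, $p_c(l,\gamma) = c(l)\gamma^{f(l+2)}$ with the central values of Table 5. The values in `RBELL_HYPOTHETICAL` are placeholders chosen only to make the example run; they are not the simulated $\mathrm{rbell}(l,\gamma)$, and the function names are illustrative.

```python
import math

def fib(n):
    """Fibonacci numbers with f(1) = f(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

D_COEF = {1: 37.0, 2: 1.06e3, 3: 2.18e4}   # d(l), central values of Table 5
C_COEF = {1: 35.2, 2: 4.47e3, 3: 7.95e6}   # c(l), central values of Table 5

def optimal_pcnot(KQ, gamma, rbell, target=2 / 3):
    """Return (level, pcnot) minimizing 0.5 * rbell / (1 - pd/4)**KQ, or None if infeasible."""
    best = None
    for level, r in sorted(rbell.items()):
        pd = D_COEF[level] * gamma ** fib(level + 1)
        pc = C_COEF[level] * gamma ** fib(level + 2)
        if pd >= 4 or pc >= 4:
            continue  # fitted model is far outside its range of validity
        # Constraint (1 - pc/4)**KQ >= target, evaluated in log space.
        if KQ * math.log1p(-pc / 4) < math.log(target):
            continue
        # log of 0.5 * rbell / (1 - pd/4)**KQ; logs avoid overflow for large KQ.
        log_cost = math.log(0.5 * r) - KQ * math.log1p(-pd / 4)
        if best is None or log_cost < best[1]:
            best = (level, log_cost)
    if best is None:
        return None
    level, log_cost = best
    return level, (math.exp(log_cost) if log_cost < 700 else math.inf)

# Placeholder Bell-state costs per level (hypothetical, for illustration only).
RBELL_HYPOTHETICAL = {1: 20.0, 2: 400.0, 3: 4000.0}

print(optimal_pcnot(1e6, 1e-4, RBELL_HYPOTHETICAL))
print(optimal_pcnot(1, 1e-4, RBELL_HYPOTHETICAL))
```

With these placeholder costs, a computation with $KQ = 10^6$ at $\gamma = 10^{-4}$ selects level 3, because the repetition factor at lower levels outweighs their smaller Bell-state cost, whereas a single gate selects level 1.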

The quantity $\mathrm{pcnot}(KQ, \gamma)$ gives the overall "work" overhead for implementing a computation
using the $C_4/C_6$ architecture, but does not differentiate between parallel and sequential resources
or indicate the number of physical qubits needed per logical qubit ("scale-up"). The $C_4/C_6$
architecture does not determine these resources uniquely, as they depend on how the trade-off between
parallelism and requirements for memory is resolved. In the case of maximum parallelism, the
scale-up is close to $\mathrm{pcnot}(KQ, \gamma)$. If minimum parallelism is used, this can be reduced to a small
multiple of the minimum scale-up associated with the $C_4/C_6$ concatenated code at the level $l$ that
is used. This minimum scale-up is given by $3^{l-1} \cdot 2$ (taking into account that there are two qubits
per block of $3^{l-1} \cdot 4$ qubits). If there is no memory error at all, the additional overhead per block can
be minimized by operating on only one block at a time. Otherwise, for each block, two additional
blocks are needed in error-correcting teleportation. Logical Bell state preparation requires an
additional overhead depending on the degree of parallelism required. If the subblock teleportations
in the preparation are done in parallel, and taking into account lower level Bell state preparations,
two more blocks or equivalent are needed for each level other than the first. This means that
$1 + 2(l-1)$ blocks are needed per computational block. The $|\pi/8\rangle$-state preparation has additional
overhead. Depending on how it is implemented it may require up to 14 blocks with their own
overhead of $1 + 2(l-1)$ or more blocks each. The contribution of $|\pi/8\rangle$-state preparation can
be minimized by implementing the logical part of the computation sequentially but using memory
steps to remove the effects of memory error as needed. Based on these estimates, the scale-up for
low but not minimum parallelism is $\approx 3^{l-1} \cdot 2\,(1 + 2(l-1))$. At levels 2, 3, 4, this evaluates to 18,
90, 378, respectively.
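
The scale-up estimates above are simple enough to evaluate directly; the short sketch below reproduces the quoted values. The function names are illustrative.

```python
def min_scale_up(level):
    """Physical qubits per logical qubit: blocks of 4 * 3**(level-1) qubits hold two logical qubits."""
    return 2 * 3 ** (level - 1)

def low_parallelism_scale_up(level):
    """Minimum scale-up times the 1 + 2*(level-1) blocks needed per computational block."""
    return min_scale_up(level) * (1 + 2 * (level - 1))

for level in (2, 3, 4):
    print(level, min_scale_up(level), low_parallelism_scale_up(level))
```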

The error-correcting $C_4/C_6$ architecture is relatively simple and designed to work well at high
EPGs. However, there is a minimum resource cost (of order $10^3$ per gate and qubit) to use it since
error-correction kicks in only at level 2. As a result, at low EPGs, architectures such as Steane's [9]
based on more efficient codes with little or no concatenation are more efficient and have more
flexibility in achieving the desired logical error probabilities. This effect can be quantified by
comparing the $C_4/C_6$ architecture to that of Steane using the illustrative example at $\gamma \approx 10^{-4}$ worked
out in [9]. Steane's error model differs from ours in that preparation, measurement and one-qubit
gates all have error probability $\gamma$. In our analysis, preparation and measurement errors are $4\gamma/15$,
which we justified with a purification scheme. This scheme could also be used in the context of
Steane's error model. We compare the two architectures based on the resources per logical qubit
of one logical step such as a CNOT, for which the $C_4/C_6$ architecture does not require one-qubit
gates other than preparation and measurement. Steane's error model also includes memory error
($\gamma/100$ per step) and accounts for measurement times in excess of gate times (25 times the gate
time). In our model and in the maximally parallel setting, this would require an additional error
of $\gamma/4$ per qubit at the end of state preparation to delay for measurement outcomes that determine
whether the state is good or not. The comparison is also complicated by Steane's method deferring
some error correction to later steps (we do not account for the implicit overhead in this) and by
our method having both detected and conditional logical error, with the latter typically being much
lower (we use only the detected error for comparison).

Steane's example is based on a [[127, 43, 13]] code, which encodes 43 logical qubits. Full error
correction of a block requires about $1.8 \times 10^4$ physical CNOTs on average and has a probability of
logical error (called "crash probability" in [9]) of $\approx 3 \times 10^{-10}$. This translates to $\approx 420$ physical
CNOTs per qubit and gate and an effective error of $\approx 7 \times 10^{-12}$ per logical qubit. The $C_4/C_6$
architecture at level 3 uses 4158.1 physical CNOTs for an error-correcting teleportation. Including
36 physical gates for a transversal operation, this gives $\approx 2100$ physical CNOTs per qubit and
gate. The detected error probability for a logical CNOT was estimated above as $2.2 \times 10^{-8}$, which
translates to $\approx 5.5 \times 10^{-9}$ effective error per qubit and gate. To meet the effective error probability
achieved by Steane requires another level of encoding. At level 4, the $C_4/C_6$ architecture uses
$2.6 \times 10^4$ physical CNOTs per qubit and gate, with a detected error probability of $6 \times 10^{-14}$
per qubit and gate. One can also compare the scale-up for the two architectures at $\gamma = 10^{-4}$:
Steane's example has a scale-up of between 10 and 20 compared to from 378 to over 2000 for
the $C_4/C_6$ architecture at level 3, depending on parallelism. As expected, Steane's architecture
requires fewer resources at low EPGs. It is however notable that the $C_4/C_6$ architecture requires
only two orders of magnitude more resources at EPGs as low as $\gamma = 10^{-4}$. The $C_4/C_6$ architecture
has the advantage of simplicity and of yielding more reliable answers, conditional on having no
detected errors.
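
The error figures in this comparison can be recovered from quantities stated earlier. The sketch below evaluates the fitted model $p_d(l) = d(l)\gamma^{f(l+1)}$ with the central values of Table 5 at $\gamma = 10^{-4}$, applies the factor of 1/4 for the effective error per qubit and gate, and divides Steane's block figures by the 43 encoded qubits. The variable names are illustrative, and the uncertainties of the fitted constants are ignored.

```python
gamma = 1e-4

# Detected error of the logical CNOT from the fitted model; f(4) = 3 and f(5) = 5.
pd_level3 = 2.18e4 * gamma ** 3
pd_level4 = 2.39e7 * gamma ** 5

# Effective error per qubit and gate is 1/4 of the logical CNOT error.
eff_level3 = pd_level3 / 4
eff_level4 = pd_level4 / 4

# Steane's [[127, 43, 13]] example: block totals shared among 43 logical qubits.
steane_cnots_per_qubit = 1.8e4 / 43
steane_error_per_qubit = 3e-10 / 43

print(f"C4/C6 level 3: detected {pd_level3:.2e}, effective {eff_level3:.2e}")
print(f"C4/C6 level 4: effective {eff_level4:.2e}")
print(f"Steane: {steane_cnots_per_qubit:.0f} CNOTs, error {steane_error_per_qubit:.1e}")
```

To the precision quoted in the text, these evaluate to the same detected and effective error probabilities and to the same per-qubit figures for Steane's example.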